Clones of the Hsp70 and Hsp83 genes of the malaria vector Anopheles
albimanus Wiedemann were isolated from a genomic DNA library. The Hsp70
genes occur at two loci on chromosome 2R. Each locus contains a pair of
divergently-oriented uninterrupted reading frames. Transcription start
sites were determined by primer extension. Maximal transcription was
observed when larvae were heat shocked at 40°C and was 150- to 350-fold
above the level of nonshocked larvae. The size of the transcripts
determined by northern analysis is consistent with genes encoding 70
kiloDalton (kDa) proteins. The DNA sequences of all of the interstitial
and protein-coding regions present were determined and compared to one
another, and to the Drosophila melanogaster Hsp70 genes. The nucleotide
and predicted protein sequences were 74% and 82% identical to those of
D. melanogaster, respectively. Compared to D. melanogaster, the
regulatory heat-shock elements in the mosquito promoters were found to be
more numerous, and to more closely match the published consensus.


Phylogenetic analysis of the mosquito heat-shock genes
demonstrates that they are homologous to the D. melanogaster Hsp70 genes
and not to the Hsp70-like cognate genes. As in D. melanogaster, the
Hsp70 genes of A. albimanus have undergone concerted evolution within
each locus, and to a lesser degree, between loci. The mosquito Hsp70
genes are a more divergent family in all regions sequenced than the D.
melanogaster family.

The  restriction  map  of  a  clone  containing  two  Hsp83  genes  was 
determined.    The  clone  hybridized  to  only  one  chromosomal  locus  on 
chromosome  3L.    The  clone  contains  a  palindrome,  two  regions  of  which 
hybridize  both  to  cDNA  probes  and  to  a  D.  melanogaster  Hsp83  probe. 
Transcripts were found to be present at moderate levels in nonshocked
larvae and were induced only several-fold at 37°C. The size of the
transcript is consistent with a gene encoding an 83 kDa protein.

Temperature effects on larval survival were investigated. Larvae
were exposed to 30 min. heat shocks at various temperatures. Almost no
mortality was observed at 40°C, but mortality was complete at 43°C. Larval
thermotolerance could be induced by a 30 min. exposure to 37°C.

CHAPTER 1
HEAT-SHOCK  GENES 


Introduction 

All  organisms  face  environmental  stresses  that  threaten  their 
homeostasis  and  therefore,  survival.    It  is  logical  that  they  have 
developed  genetic  systems  that  respond  in  biologically  appropriate  ways. 

Genes  that  respond  to  stress  have  only  recently  been  identified 
and  their  functions  studied.    The  heat-shock  genes  fall  into  this 
general class called stress-response genes (Atkinson and Walden,
1985)(Pardue et al., 1989). They are distinguished by their
inducibility  upon  temperature  elevation  or  sequence  similarity  with  such 
genes. 

The  heat-shock  response  can  be  summarized  as  rapid,  reversible, 
heat-induced  synthesis  of  a  small,  specific  set  of  proteins  and 
concurrent  repression  of  synthesis  of  almost  all  other  proteins. 
Similar  heat-shock  responses  are  ubiquitous  in  animals,  plants  and 
microorganisms and have been thoroughly reviewed (Ashburner and Bonner,
1979)(Schlesinger et al., 1982)(Craig, 1985)(Lindquist, 1986)
(Lindquist and Craig, 1988). The most extensive work has been done in
Drosophila melanogaster due to the ease of manipulation and wealth of
genetic information available.

Discovery of Heat-Shock Genes

In 1962, the first observations were published which hinted that
heat-inducible genes existed. Ritossa (1962) noted that certain regions
of the polytene chromosomes of Drosophila busckii puffed rapidly and
transcribed RNA at a higher rate when exposed to temperature shocks,
dinitrophenol (DNP), or sodium salicylate treatment, but returned to
their normal form when the treatments were removed. Thereafter, it was
shown that in D. melanogaster a specific set of proteins appears upon
heat shock (Tissieres et al., 1974), a fact beautifully confirmed and
extended by Lindquist (1980).

The ability of DNP and sodium salicylate to induce a heat-shock
response in D. busckii showed that heat was not the only inducer. For
animals, numerous classes of inducers of the heat-shock response were
discovered: oxidizing agents, transition series metals, amino acid
analogs, steroid hormones, wounding, and recovery from hypoxia. Other
classes of inducers and specific effects have been compiled (Nover,
1984).

Heat-Shock  Proteins  in  Drosophila  melanogaster 

Regardless  of  the  organism,  the  heat-shock  proteins  (HSPs)  are 
classified  into  three  groups  according  to  their  molecular  weight:  the 
90,  70,  and  20  kiloDalton  (kDa)  groups  (Pardue,  1988). 

In D. melanogaster, the major HSPs are HSP83 (90 kDa class), HSP70
and HSP68 (70 kDa class), and the small heat-shock proteins, HSP27,
HSP26, HSP23, and HSP22 (all 20 kDa class). Not all are induced to the
same level or in the same tissues, nor is the maximal induction
temperature the same. As an introduction to the various patterns of
expression, I will present an overview of expression of the three major
groups of D. melanogaster heat-shock proteins.

The  Hsp70  genes  have  been  studied  most  extensively  since  their 
transcripts  and  proteins  are  the  most  abundant.    Most  D.  melanogaster 
strains contain five copies of this gene per haploid genome
(Ish-Horowicz et al., 1979)(Mirault et al., 1979), and these are believed to
be  coordinately  expressed.    Hsp70  transcription  occurs  at  low  levels  at 
normal  rearing  temperatures  (25°C),  is  induced  only  slightly  at  33°C, 
and  increases  100-  to  1000-fold  within  minutes  at  the  optimal  induction 
temperature  of  37.5°C.    Translation  commences  within  5  minutes,  and 
after  1  hour,  the  heat-shock  proteins  represent  90%  of  the  total  protein 
synthesized  (Lindquist,  1980).    Hsp70  genes  are  expressed  in  most 
tissues  of  the  larva  and  adult  with  the  exception  of  the  brain  and  the 
post-meiotic  cells  of  the  testes  (Bonner  et  al.,  1984). 

In contrast, the Hsp83 gene (one copy per haploid genome (Hackett
and Lis, 1983)) is transcribed at moderate levels in flies grown at
normal temperatures, is transcribed at a several-fold higher rate at
33-35°C, and is repressed at 37-38°C (Lindquist, 1980). The
tissue-specific distribution of the protein also differs from that of
HSP70. It is generally expressed at moderate levels and in high
concentration in ovaries (Mason et al., 1984). This is the only
D. melanogaster heat-shock gene that contains an intron (Hackett and Lis,
1983), the splicing of which is related to its repression at high
temperatures (Yost and Lindquist, 1985).

The 20 kDa class of heat-shock genes displays yet another variety
of regulation. These genes are transcribed variably, depending on the
stage of development (Mason et al., 1984), and are moderately induced by
heat shock but are also induced by ecdysone (Ireland et al.,
1982)(Morganelli et al., 1985).

In  addition  to  the  above  genes  that  are  recognized  as  the  genuine 
D.  melanogaster  heat-shock  genes,  other  heat-shock-related  genes  have 
been  identified  on  the  basis  of  either  heat-shock  inducibility  or 
sequence  similarity.    The  essential  hsromega  genes  are  inducible  and 
located  at  a  major  chromosome  puff  (93D)  but  may  not  be  translated. 
Rather, the functional product seems to be the RNAs (Hovemann et al.,
1986)(Bendena et al., 1989). Unlike the true Hsp70 genes, the Hsp70
cognate  genes  contain  introns,  but  show  no  heat  inducibility  (Craig  et 
al.,  1983).    The  alpha-beta  sequences  show  heat-inducible  transcription 
but  are  not  translated  and  probably  have  no  essential  function  (Craig, 
1985). 

All  of  the  above  true  heat-shock  genes  are  located  at  chromosomal 
loci  which  demonstrate  heat-inducible  puffing.    Puffing  is  generally 
correlated  with  increased  rates  of  transcription,  and  this  has  been 
shown  specifically  for  the  heat-shock  genes  (Ritossa,  1962) (Tissieres  et 
al.,  1974) (Compton  and  McCarthy,  1978).    Consistent  with  the  changing 
pattern  of  transcription  during  heat  shock,  RNA  Polymerase  II 
accumulates  in  bands  that  show  heat-induced  puffing  but  not  at  other 
loci  (Bonner,  1981). 

Specific  DNA  sequences  of  heat-shock  genes  are  sufficient  to 
initiate  puffs.    When  hybrid  genes  containing  Hsp70  promoters  are 
integrated  in  the  chromosomes  of  D.  melanogaster,  new  puffs  and 
transcripts appear at the loci where the hybrid genes are located (Lis
et al., 1983)(Bonner et al., 1984)(Dudler and Travers, 1984).


Transcriptional and Translational Control During Heat Shock


At  elevated  temperatures,  the  heat-shock  genes  are  transcribed  and 
translated  at  higher  rates.    However,  the  expression  of  almost  all  other 
genes  is  reduced  (Tissieres  et  al.,  1974).    This  has  been  attributed  to 
both  transcriptional  and  translational  controls  (reviewed  in  Bienz  and 
Pelham,  (1987)).    Previously  synthesized  transcripts  from  non-heat-shock 
genes  are  not  translated  during  shock,  but  they  are  not  degraded  and  are 
translatable  when  the  cell  returns  to  its  normal  temperature  (Storti  et 
al.,  1980).    An  exception  to  suppression  of  expression  that  has  been 
noted is the histone genes of D. melanogaster (Spadoro et al., 1986).

The  unique  factors  that  are  necessary  for  preferential 
transcription  of  heat-shock  genes  are  found  in  their  DNA  sequence. 
Pelham  (1982)  first  identified  regulatory  regions  in  the  Hsp70  promoter 
that are necessary for heat-induced transcription. These 14 base pair
DNA sequences, called heat-shock elements (HSEs), have been found in the
promoters of all heat-inducible genes, and the regulatory mechanism of
heat-shock genes appears to be highly conserved across the animal
kingdom (Pelham, 1985)(Bienz and Pelham, 1987). HSEs are the binding
sites for the trimeric heat-shock transcription factor (HSTF or HSF),
which is necessary to induce transcription (Parker and Topol,
1984)(Shuey and Parker, 1986)(Perisic et al., 1989). The factor is
present under non-shock conditions and is believed to be activated by
phosphorylation (Zimarino and Wu, 1987)(Sorger and Pelham, 1988).
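
As a rough illustration of how HSE-like sites might be located and scored in a promoter sequence, the sketch below slides a 14-base window along one strand and counts matches to a consensus. The consensus string CTNGAANNTTCNAG is the commonly cited form and is an assumption here, since the text above states only the length of the element. The function names, the example sequence, and the match threshold are hypothetical.

```python
# Assumed 14-bp HSE consensus (N = any base); not given in this text.
HSE_CONSENSUS = "CTNGAANNTTCNAG"


def count_matches(site, consensus=HSE_CONSENSUS):
    """Count the non-N consensus positions that the candidate site matches."""
    return sum(
        1 for s, c in zip(site.upper(), consensus) if c != "N" and s == c
    )


def find_hse_like_sites(sequence, min_matches=8, consensus=HSE_CONSENSUS):
    """Return (position, site, matches) for each window meeting the threshold.

    Only the given strand is scanned; a fuller search would also scan the
    reverse complement.
    """
    width = len(consensus)
    hits = []
    for i in range(len(sequence) - width + 1):
        site = sequence[i:i + width].upper()
        matches = count_matches(site, consensus)
        if matches >= min_matches:
            hits.append((i, site, matches))
    return hits


# Hypothetical promoter fragment: one perfect site (position 4) and one
# site with a single mismatch (position 22), separated by A-rich spacers.
promoter = "AAAACTCGAATGTTCGAGAAAACTCGAATGTTAGAGAAAA"
hits = find_hse_like_sites(promoter, min_matches=9)
for position, site, matches in hits:
    print(position, site, matches)
```

Counting both the number of sites and the number of matched positions per site gives one simple way to compare how numerous the elements are and how closely they match a consensus in two promoters.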


The  ease  with  which  dramatic  changes  in  transcription  and 
translation  of  heat-shock  genes  can  be  induced  has  made  them  a  model 
system for understanding eukaryotic gene regulation (Pelham, 1985)(Bienz
and Pelham, 1987). Also, the conserved nature of Hsp70 transcription
induction has been of tremendous benefit to studies of hybrid gene
expression. This promoter is the most commonly used for expression of
hybrid gene constructs in Drosophila and other insects (e.g.
beta-galactosidase (Lis et al., 1983), chloramphenicol acetyl transferase
(DiNocera and Dawid, 1983), and alcohol dehydrogenase (Bonner et al.,
1984)).

Translational  control  of  heat-shock  gene  expression  allows 
preferential  translation  of  heat-shock  transcripts  over  those  produced 
under  non-shock  conditions.    Like  transcription,  this  discrimination  is 
a  DNA  sequence-specific  effect  and  will  be  discussed  in  Chapter  3. 

The  Functions  of  Heat-Shock  Genes 

A  traditional  approach  to  understanding  the  functions  of  heat- 
shock  genes  is  the  isolation  of  mutations  altering  or  eliminating  their 
expression.    The  fact  that  deletion  mutants  for  the  D.  melanogaster 
Hsp70 genes are lethal in early embryos or larvae under nonshock
conditions  demonstrates  that  they  have  essential  functions  unrelated  to 
the  high  level  of  expression  observed  when  induced  (Ish-Horowicz,  et 
al.,  1977).    Other  searches  for  D.  melanogaster  heat-shock  expression 
mutants  have  resulted  in  isolation  of  mutations  in  unrelated  genes  that 
cause synthesis of abnormal proteins, e.g. actin, which induce the
normal heat-shock response but are not heat-shock mutations per se
(Hiromi et al., 1986)(Parker-Thornburg and Bonner, 1987).

True  heat-shock  gene  mutations  that  demonstrate  the  vital  function 
of  these  genes  have  been  successfully  isolated  in  Escherichia  coli  and 
yeast. Deletions reducing expression of the E. coli GroE heat-shock
genes prevent growth at normal temperatures, and DnaK (a 70 kDa-group
protein) is necessary for heat-shock tolerance (Kusukawa and Yura,
1988). Similarly, mutations of the Saccharomyces cerevisiae heat-shock
factor which regulates heat-inducible transcription are lethal (Sorger
and Pelham, 1988).

The appearance and regulation of heat-shock transcripts and
proteins was a well-developed area of study long before the function of
heat-shock proteins was opened to study at the biochemical level. Early
naive  suggestions  were  made  that  heat-shock  proteins  somehow  prevent  or 
protect  against  heat-induced  protein  denaturation  and  consequent  loss  of 
activity.    This  has  proven  to  be  very  close  to  the  functions 
demonstrated  by  evidence  collected  in  the  past  few  years.    HSP60  has  a 
role  in  mitochondria  in  maintaining  imported  proteins  in  a  translocation 
and assembly-competent form (Cheng et al., 1989)(Ostermann et al.,
1989)(Hartl and Neupert, 1990). Cell export of proteins in E. coli is
facilitated  by  HSP70-  and  HSP60-like  proteins,  presumably  due  to 
facilitated  folding  (Phillips  and  Silhavy,  1990).    A  protein  similar  to 
HSP70  is  involved  in  transport  competence  of  proteins  destined  for 
degradation  in  rat  lysosomes  (Chiang  et  al.,  1989). 

A  related  but  more  universal  function  is  indicated  by  studies  of 
the  association  of  HSP70  with  newly-synthesized  proteins  by  Beckmann  et 
al.  (1990).    If  these  authors  are  correct,  all  protein  folding  may  occur 
not simply as a consequence of its primary amino acid sequence, but as a
result of an intimate association with this heat-shock protein. Due to
repeated observations of facilitated trafficking and folding as a result
of association with heat-shock proteins, the heat-shock proteins are
considered "molecular chaperons" or "chaperonins."

Given  the  deleterious  effects  of  heat-shock  mutations  on  normal 
and  heat-stress  growth,  and  their  recently  discovered  importance  in 
protein  folding  and  translocation,  it  is  not  surprising  that  the 
appearance  of  heat-shock  proteins  and  increases  in  thermotolerance  are 
correlated. Stephanou et al. (1983a) positively correlated heat-shock
protein synthesis with increased survival of D. melanogaster that had
been genetically selected for heat tolerance. The Mediterranean
fruit fly, Ceratitis capitata, showed increased survival at
normally-lethal temperatures if heat-shock protein synthesis was induced
by a sublethal heat shock prior to high temperature exposure (Stephanou
et al., 1983b)(Stephanou, 1987). Similarly in yeast, exposure of cultures
to elevated temperatures before a usually-lethal exposure increased
survival and was correlated with increased synthesis of heat-shock
proteins (McAlister and Finkelstein, 1980).

Heat Shock in Non-Drosophilids

Heat  shock  has  been  investigated  very  little  in  insects  besides 
Drosophila  spp.    Many  studies  have  been  done  in  Chironomus  (e.g.  Vincent 
and Tanguay (1979), and Barettino et al. (1982)), and a few in
Sarcophaga  bullata  (Bultmann,  1986a) (Bultmann,  1986b). 


Studies of heat-shock-related phenomena in mosquitos are primarily
of hybrid gene expression controlled by D. melanogaster Hsp70 promoters
in Aedes albopictus cell cultures (Berger et al., 1985)(Durbin and
Fallon, 1985)(Fallon, 1986)(Gerenday et al., 1989) or, in one case, in
genetically transformed Anopheles gambiae (Miller et al., 1987).
Endogenous mosquito heat-shock proteins have been studied only in
A. albopictus cell cultures (Carvalho and Rebello, 1987)(Carvalho and
Freitas, 1988)(Gerenday et al., 1989)(Tatem and Stollar, 1989).

Narang et al. (1985) determined the restriction pattern of
Anopheles albimanus Wiedemann genomic digests probed with
D. melanogaster Hsp70 clones. Besides this investigation, no insect
heat-shock genes outside of Drosophila spp. have been studied at the
level of gene organization, nucleic acid or protein sequence.

As  a  first  step  to  understanding  the  function,  structure  and 
expression  of  heat-shock  genes  of  the  malaria  vector  A.  albimanus,  I 
have  undertaken  experiments  regarding  three  areas  of  heat  shock  related 
to  mosquito  biology  and  genetic  manipulation:    the  effect  of  heat  shock 
on  mosquito  survival,  the  structure  and  expression  of  the  Hsp70  and 
Hsp83  genes,  and  the  relationship  of  the  mosquito  heat-shock  genes  to 
those of Drosophila spp.


CHAPTER  2 

HEAT-SHOCK  MORTALITY  AND  INDUCED  THERMOTOLERANCE 


Introduction

Heat-induced  thermotolerance  has  been  observed  in  numerous 
Diptera, e.g. Drosophila melanogaster (Alahiotis and Stephanou,
1982)(Berger and Woodward, 1983)(Singh and Lakhotia, 1988), Chironomus
striatipennis (Nath and Lakhotia, 1989), and Ceratitis capitata
(Stephanou et al., 1983b). Generally this is demonstrated by exposing
insects  to  a  relatively  mild  heat  shock  before  exposure  to  temperatures 
in  the  lethal  range.    Alternatively,  insects  are  reared  at  various 
temperatures  before  the  lethal  exposure.    The  results  of  both  types  of 
experiments  are  consistent  with  increased  survival  as  a  consequence  of 
previous  exposure  to  elevated  temperatures. 

In  this  study,  similar  experiments  were  conducted  for  the  tropical 
mosquito Anopheles albimanus. Specifically, I asked four questions. At
what temperature does heat-induced mortality occur? How broad is the
lethal range? Is mortality affected by the rearing temperature or by
prior exposure to a sublethal heat shock? Do more extreme sublethal heat
shocks produce greater thermotolerance?

Materials and Methods

Heat  Shocks 

Heat-shock mortality in relation to rearing temperature. A.
albimanus larvae from the USDA-Insects Affecting Man and Animals
Research Laboratory main colony were reared at 25.0 or 30.0°C (± 0.5°C)
from egg hatch to the fourth instar on a diet of 2 parts of TetraMin
Baby-E Fish Food (TM) to 1 part brewers yeast (Benedict et al., 1979).
One hundred mid to late fourth-instar larvae were counted into each of
six treatment containers consisting of 100 ml plastic beakers, the bottom
of which had been cut off and replaced with fine plastic screen. These
were transferred to identical foam ice chests containing approximately 5
liters of municipal supply water adjusted to 37.0, 38.5, 40.0, 41.5, and
43.0°C for the 25.0°C rearing tests, or 38.5, 40.0, 41.5, 43.0, and
44.5°C for the 30.0°C rearing tests. Controls for the heat-lethality
tests  were  larvae  counted  and  handled  the  same  as  the  heat-treated 
larvae,  but  transferred  to  identical  chests  filled  with  25.0°C  or  30.0°C 
water,  depending  on  the  original  rearing  temperature.    The  temperature 
in  these  chests  was  maintained  within  0.5°C  by  stirring  the  water,  and 
adding  warm  water  every  five  to  ten  minutes.    Larvae  were  held  at  the 
different  temperatures  for  30  minutes  and  then  transferred  back  to  water 
at  25.0  or  30.0°C  for  30  minutes,  at  the  end  of  which  time,  dead  larvae 
were  counted.    These  experiments  were  repeated  three  times. 

Heat-shock  mortality  in  relation  to  a  brief  sublethal  heat  shock. 
All  larvae  for  these  tests  were  handled  the  same  as  those  above  and  were 
reared  at  25°C.    Sublethal  heat  shocks  were  administered  at  28.0,  33.0, 
or  37.0°C  for  30  minutes.    The  control  was  similar  but  transferred  to 
25.0°C. Each of these temperature groups contained five groups of 100
larvae.    At  the  end  of  the  heat-treatment  period,  the  larvae  were 
transferred  back  to  25.0°C  for  30  minutes.    One  beaker  from  each  of  the 
groups  was  transferred  to  25.0,  40.0,  41.5,  43.0,  or  44.5°C.    They  were 
held  there  for  30  min.  and  then  once  again  transferred  back  to  25.0°C 
for  30  minutes,  at  the  end  of  which  time  dead  larvae  were  counted. 
These  experiments  were  repeated  three  times. 
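
The preshock protocol above amounts to a crossed design of preshock temperature by lethal-range temperature. A minimal sketch of that layout for one replicate follows; the variable and function names are hypothetical, and the temperatures and beaker size are those stated in the text.

```python
from itertools import product

# Preshock temperatures (°C); 25.0 is the mock-preshocked control.
PRESHOCK_TEMPS_C = [25.0, 28.0, 33.0, 37.0]
# Second-exposure temperatures (°C); 25.0 is the unheated control.
CHALLENGE_TEMPS_C = [25.0, 40.0, 41.5, 43.0, 44.5]
LARVAE_PER_BEAKER = 100


def preshock_design():
    """List one beaker per (preshock, challenge) temperature combination."""
    return [
        {"preshock_c": p, "challenge_c": c, "larvae": LARVAE_PER_BEAKER}
        for p, c in product(PRESHOCK_TEMPS_C, CHALLENGE_TEMPS_C)
    ]


design = preshock_design()
total_larvae = sum(beaker["larvae"] for beaker in design)
print(len(design), "beakers,", total_larvae, "larvae per replicate")
```

Each of the four preshock groups contributes five beakers, one per second-exposure temperature, so mortality after every combination is scored on its own group of 100 larvae.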

Data  Analysis 

All  mortality  data  were  transformed  by  an  angular  transformation, 
the  arcsin  of  the  square  root  of  the  percent  mortality.    Analysis  of 
variance  (ANOVA)  and  Duncan's  Multiple  Range  Test  were  used  to  compare 
the  transformed  mortalities  using  the  SAS  procedure  ANOVA.    The  main 
effects of the rearing temperature experiments were replicate, rearing
temperature, and lethal-range temperature. Only the temperatures that
were used to treat both the 25.0 and 30.0°C larvae (38.5, 40.0, 41.5,
and 43.0°C) were used for statistical comparisons. For the second set
of experiments, the main effects considered were replicate, preshock
temperature, and lethal-range temperature. Transformed mortalities of
larvae  reared  at  25.0°C  in  the  first  set  of  experiments  and  controls  in 
the  second  set  that  were  mock  preshocked  at  25.0°C  were  compared  using 
the  SAS  procedure  TTEST.    The  significance  level  for  all  statistics  was 
0.05. 
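
The angular transformation described above can be sketched as follows. Whether the original analysis expressed the angle in degrees or radians is not stated, so the use of degrees here is an assumption, and the function name is hypothetical.

```python
import math


def angular_transform(percent_mortality):
    """Return the arcsine of the square root of the mortality proportion.

    The input is percent mortality (0 to 100). The result is in degrees,
    which is an assumption; the text does not state the unit used.
    """
    if not 0.0 <= percent_mortality <= 100.0:
        raise ValueError("percent mortality must be between 0 and 100")
    proportion = percent_mortality / 100.0
    return math.degrees(math.asin(math.sqrt(proportion)))


# With 100 larvae per beaker, the number killed equals percent mortality.
for killed in (0, 9, 50, 93, 100):
    print(killed, round(angular_transform(killed), 1))
```

The transformation stretches the scale near 0% and 100%, where binomial percentages have the smallest variance, which makes the data better suited to ANOVA.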

Results and Discussion

Heat-Shock  Mortality  in  Relation  to  Rearing  Temperature 

A  set  of  experiments  was  designed  to  determine  larval  mortality  at 
various  temperatures  and  whether  tolerance  to  these  temperatures  could 
be  increased  by  rearing  larvae  at  a  higher  temperature.    Two  rearing 
temperatures,  25.0  and  30.0°C,  were  chosen.    These  temperatures  promote 
high  survival  and  reasonable  development  times.    A  graphical 
representation  of  the  mortality  data  is  shown  in  Figure  2-1.    Table  2-1 
shows  the  raw  data,  and  Tables  2-2  and  2-3  show  the  ANOVA  statistics  and 
statistically  significant  subgroups  by  Duncan  analysis. 

Larvae  reared  at  25.0°C  were  killed  in  significantly  higher 
numbers  than  those  reared  at  30.0°C.    This  demonstrates  that  raising  the 
rearing  temperature  can  decrease  sensitivity  to  heat,  and  that  mortality 
occurs  in  a  very  narrow  temperature  range  from  40.0  to  43.0°C. 

Mortality was not significantly different at 38.5°C from that at
40.0°C (Table 2-3). Significant differences occurred between the first
and second replicates among 25.0°C-reared larvae (replicates 1-3).
Replicate tests of larvae reared at 30.0°C were not significantly
different (replicates 4-6). Replicate variation is probably due to
uncontrolled variables affecting the very narrow response range. No
control mortality occurred in these or the following experiments.

Preshock-Induced Heat-Tolerance Experiments

The  second  set  of  experiments  determined  the  effect  of  a  brief 
sublethal  heat  shock  on  heat  sensitivity.    The  sublethal  shock 
temperatures chosen were based on the results of the first set of
experiments above. No mortality had been observed at 37.0°C, so extreme
shocks were administered at this temperature, moderate shocks at 33.0°C,
and mild shocks at 28.0°C. The raw mortality data are graphed in Figure
2-2. Table 2-4 shows the raw data, and Tables 2-5 and 2-6 show the ANOVA
statistics and Duncan groupings.

Larvae  preshocked  at  37.0°C  were  significantly  more  resistant  to 
heat-killing  than  those  shocked  at  28°C  or  controls.    The  28.0  and 
33.0°C  groups  show  a  trend  toward  decreased  sensitivity  which  might  have 
been  statistically  significant,  if  the  variability  had  been  lower 
(Tables  2-4  and  2-6). 

Larvae  were  reared  at  25.0°C  and  heat  treated  similarly  in  the 
first  and  second  experiments  allowing  comparison  of  the  mortality.  This 
comparison  would  allow  detection  of  significant  temporal  changes  in  the 
heat  sensitivity  of  the  larvae.    This  would  also  reveal  effects  of 
handling differences between the first and second experiments. Neither
effect was detected: the amount of heat-caused mortality in the 25.0°C
group of the first experiments and in the 25.0°C controls of the second
set was not significantly different (t test, Prob. > F' = 0.94).
However, once again in the second set of experiments, statistically
significant differences were observed between the first and third
replicates, probably for the same reasons suggested above.

These  experiments  show  that  not  only  does  rearing  larvae  at  higher 
temperatures  increase  their  resistance  to  heat-killing,  but  a  relatively 
brief  30  minute  exposure  to  37.0°C  also  increases  their  heat  resistance. 
Some thermotolerance can probably be induced by the slightest of
temperature elevations; however, in these experiments, the resolution is
limited by replicate variation due to uncontrolled variables.

These observations of inducible thermotolerance after a brief
shock, or as a result of different rearing temperatures, are similar to
those made in D. melanogaster (Alahiotis and Stephanou, 1982)(Berger and
Woodward, 1983)(Singh and Lakhotia, 1988), C. striatipennis (Nath and
Lakhotia, 1989), and C. capitata (Stephanou et al., 1983b). The lethal
temperature  range  that  I  have  observed  is  also  similar  to  that  seen  in 
the  above  references,  although  direct  comparisons  are  difficult  to  make 
due  to  differences  in  the  treatment  methods  and  periods  after  treatment 
at  which  lethality  was  determined.    In  the  experiments  reported  here, 
additional  delayed  mortality  might  have  occurred  among  larvae  that  were 
counted  as  survivors. 

What  is  the  physiological  basis  for  thermotolerance?  Although 
thermotolerance  is  undoubtedly  complex,  increased  thermotolerance  is 
positively  correlated  with  increased  synthesis  of  heat-shock  proteins 
(reviewed  by  Craig,  1985).    For  example,  D.  melanogaster  genetic  strains 
have been selected for cold or warm rearing conditions (Stephanou et
al., 1983a)(Alahiotis and Stephanou, 1982). The cold-selected strain is
more  sensitive  than  the  warm-selected  to  heat-killing  when  reared 
similarly.    The  sensitive  strain  produces  lower  levels  of  heat-shock 
proteins  than  does  the  insensitive.    Other  experiments  have  shown  that 
this  genetic  effect  can  be  simulated  merely  by  rearing  the  same  strain 
at  two  temperatures  (Singh  and  Lakhotia,  1988).    Cold-reared  D. 
melanogaster  are  more  sensitive  to  heat  killing  than  those  reared 
warmer. Similarly, C. capitata that have been preshocked have higher
levels of heat-shock proteins and thermotolerance than larvae not
shocked (Stephanou et al., 1983b). The onset of thermotolerance in
D. melanogaster embryos occurs at gastrulation, the same stage at which
they are able to synthesize heat-shock proteins (Bergh and Arking,
1984). Finally, when D. melanogaster cells are treated with ecdysone,
which  is  known  to  induce  synthesis  of  the  small  heat-shock  proteins, 
thermotolerance  increases  (Berger  and  Woodward,  1983) (Berger,  1984). 
Likewise,  immature  stages  of  whole  animals  have  greater  thermotolerance 
during  periods  of  higher  ecdysone  titer. 

These  experiments,  though  suggestive,  are  not  as  conclusive  as 
data  from  Escherichia  coli  and  yeast  showing  that  deletion  mutants  for 
heat-shock  protein  genes  are  unable  to  grow  at  elevated,  or  sometimes 
even normal temperatures, but can grow at reduced temperatures (Craig,
1985).

Seasonal  variation  in  the  levels  of  heat-shock  protein  synthesis 
probably  occurs  in  mosquitos.    This  has  been  observed  in  natural 
populations  of  C.  striatipennis  (Nath  and  Lakhotia,  1989).  These  authors 
observed  seasonal  and  temperature-related  variation  in  chromosome 
puffing and in heat-shock protein inducibility. Heat-shock induction of
puffs  and  proteins  was  greater  in  larvae  that  were  laboratory-reared  at 
a  constant  temperature  than  in  those  that  had  been  exposed  to  warmer 
natural  conditions  and  were  already  synthesizing  heat-shock  proteins. 
The  field-collected  larvae  were  also  less  sensitive  to  heat  killing  than 
the constant-temperature laboratory-reared larvae.

Thermoprotection  is  a  common  necessity  for  mosquitos,  particularly 
tropical  larvae  breeding  in  exposed  sites  where  daily  and  seasonal 
temperature  fluctuations  occur.    These  experiments  demonstrate  that 
mechanisms  exist  in  A.  albimanus  to  increase  its  survival  under  variable 
temperature conditions.

Table 2-1.
Number of Larvae Killed in
Rearing-Temperature Experiments

                   Treatment Temperature (°C, ± 0.5)(a)
Rearing
Temp.     Rep.     25.0     (illeg.) (illeg.) 38.5     40.0     41.5     43.0     44.5
                                              ----     ----     ----     ----
25.0      1        0        ND(b)    0        1        9        93       100      ND
25.0      2        0        ND       0        0        5        39       100      ND
25.0      3        0        ND       0        0        10       80       100      ND
average            0                 0        0.3      8        70.7     100

30.0      4        ND       0        0        0        0        9        96       100
30.0      5        ND       0        ND       0        0        15       100      100
30.0      6        ND       0        ND       0        0        2        100      100
average                     0        0        0        0        8.7      98.7     100

a Only data under the underlined temperatures were used for ANOVA.
b ND indicates experiment not done.


Table 2-2.
ANOVA Statistics for
Rearing-Temperature Experiments

Source                  Df     SS        F value    Prob. > F
Model                   9      31,615    79.59      0.0001
Error                   14     618
R² = 0.98

Dependent Variables     Df     SS        F value    Prob. > F
Lethal Temperature      3      28,183    212.59     0.0001
Replicate               5      1,894     378.77     0.0007
Rearing Temperature     1      1,575     35.65      0.0001
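
As a consistency check on the ANOVA table, the F ratios can be recomputed from the tabulated sums of squares and degrees of freedom, taking F as the effect mean square divided by the error mean square. The sketch below is only an illustration under that assumption; the function names are hypothetical, and the numbers are copied from the SS and Df columns of Table 2-2.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F ratio: effect mean square divided by error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)


def r_squared(ss_model, ss_error):
    """Fraction of the total sum of squares explained by the model."""
    return ss_model / (ss_model + ss_error)


# SS and Df values copied from Table 2-2.
SS_ERROR, DF_ERROR = 618.0, 14
EFFECTS = {
    "Model": (31615.0, 9),
    "Lethal Temperature": (28183.0, 3),
    "Replicate": (1894.0, 5),
    "Rearing Temperature": (1575.0, 1),
}

if __name__ == "__main__":
    for name, (ss, df) in EFFECTS.items():
        mean_square = ss / df
        f_value = f_ratio(ss, df, SS_ERROR, DF_ERROR)
        print(f"{name:20s} MS = {mean_square:9.2f}   F = {f_value:7.2f}")
    print(f"R-squared = {r_squared(31615.0, SS_ERROR):.2f}")
```

Under this formula the Model, Lethal Temperature, and Rearing Temperature ratios come out close to the tabulated values. The tabulated Replicate entry (378.77) is instead close to SS/Df, so it may be a mean square rather than an F ratio.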


Table 2-3.
Duncan's Multiple Range Test Grouping for
Rearing Temperature Experiments

Effect                  Levels
Lethal Temperature      38.5   40.0   41.5   43.0
Replicate               3b     2b     5c     6c     4c
Rearing Temperature     25.0   30.0

[Grouping(a) bars for each effect are not legible in this copy.]

a Continuous bars join variables with the same Duncan grouping.
b These groups reared at 25°C.
c These replicates reared at 30°C.


Table 2-4.
Number of Larvae Killed in
Preshock-Induced Heat-Tolerance Experiments

                     Treatment Temp. (°C, ± 0.5)
Preshock
Temp.     Replicate  25.0     40.0     41.5     43.0
25.0      1          0        12       25       100
25.0      2          0        0        68       95
25.0      3          0        0        8        90
average              0        4        33.7     95

28.0      1          0        22       14       100
28.0      2          ND(a)    3        23       93
28.0      3          0        0        12       94
average              0        8.3      16.3     95.7

33.0      1          0        4        8        96
33.0      2          0        1        13       92
33.0      3          0        0        5        90
average              0        1.7      8.7      92.7

37.0      1          0        2        3        81
37.0      2          0        0        7        83
37.0      3          0        0        6        84
average              0        0.7      5.3      82.7

a Not Done


Table 2-5.
ANOVA Statistics for
Preshock-Induced Heat-Tolerance Experiments

Source                  Df     SS        F value    Prob. > F
Model                   7      32,198    73.88      0.0001
Error                   28     1,743
R² = 0.95

Dependent Variables     Df     SS        F value    Prob. > F
Lethal Temperature      2      30,599    245.73     0.0001
Replicate               2      615       307.86     0.0145
Preshock Temperature    3      983       327.70     0.0053


Table 2-6.
Duncan's Multiple Range Test Grouping for
Preshock-Induced Heat-Tolerance Experiments

Effect                  Levels
Lethal Temperature      40.0   41.5   43.0
Replicate               1      2      3
Preshock Temperature    25.0   28.0   33.0   37.0

[Grouping(a) bars for each effect are not legible in this copy.]

a Continuous bars join groups with the same Duncan grouping.


CHAPTER  3 

ORGANIZATION,  LOCATION,  AND  EXPRESSION 
OF  THE  HSP70  AND  HSP83  HEAT-SHOCK  GENES 
IN  THE  MOSQUITO  ANOPHELES  ALBIMANUS 

Introduction

Numerous  strategies  have  been  presented  to  modify  and  control 
agriculturally  and  medically  important  insects  by  both  traditional  and 
molecular genetic means (e.g. Cockburn et al., 1984, Kirschbaum, 1985).
Molecular efforts to modify mosquitos thus far have concentrated on
methods of genetic transformation (Miller et al., 1987)(McGrane et al.,
1988)(Morris et al., 1989) and appropriate transformation markers, e.g.
(Berger et al., 1985)(Durbin and Fallon, 1985). The effort of isolating
and characterizing novel endogenous promoters to drive expression of
hybrid genes has been given little attention, due in part to general
success expressing genes under the control of the Drosophila
melanogaster Hsp70 promoter in other animals: monkey (Pelham, 1982),
mouse (Corces et al., 1981), and sea urchin (McMahon et al., 1984).

To date, the Drosophila melanogaster Hsp70 promoter has been used
as the inducible promoter for all hybrid gene expression in mosquitos
(Berger et al., 1985)(Durbin and Fallon, 1985)(Fallon, 1986)(Miller et
al., 1987)(McGrane et al., 1988)(Morris et al., 1989)(Gerenday et al.,
1989). The Hsp70 promoter has effectively increased chloramphenicol
acetyl transferase (CAT) synthesis 30-fold (Berger et al., 1985) and
G418 resistance similarly (Miller et al., 1987), although in the latter
case low level constitutive expression was observed. These levels of
induction of hybrid genes are similar to those observed in D.
melanogaster.

Although  the  D.  melanogaster  Hsp70  promoter  has  been  adequate  for 
hybrid  gene  expression  in  mosquitos,  endogenous  heat  shock  promoters 
might  possess  differences  that  would  make  them  superior.  These 
advantages  could  be  either  in  reduction  of  the  temperatures  required  to 
obtain  satisfactory  expression,  the  constitutive  level  of  expression,  or 
tissue  specificity.    Therefore,  it  is  reasonable  to  study  the  structure 
and  expression  of  endogenous  heat  shock  genes  to  find  distinguishing 
characteristics  that  suggest  their  use  as  an  alternative  in  genetic 
modification  of  insects.    Additionally,  features  that  are  shared  with  D. 
melanogaster  may  reveal  unrecognized  functional  elements  which  will  lead 
to  a  better  understanding  of  the  regulation  of  these  genes. 

Two  D.  melanogaster  genes,  Hsp70  and  Hsp83,  were  used  as  probes  to 
screen  a  genomic  library  of  the  mosquito  Anopheles  albimanus  Wiedemann 
for  homologous  genes.    These  mosquito  genes  were  characterized  and 
evaluated  to  investigate  the  potential  of  mosquito  heat-shock  promoters 
for  hybrid-gene  expression. 

Materials  and  Methods 

General  Molecular  Methods 

Gels were 0.5-1.2% agarose (Sigma) buffered and run in 1X TBE
(0.089 M Tris-borate, 0.089 M boric acid, 2 mM EDTA) at < 5.5 V/cm.
Fragments were sized using lambda Hind III or 1 kilobase pair (kbp)
ladder fragment standards from Bethesda Research Laboratories (BRL).

Plasmids  were  prepared  by  the  boiling  method  (Holmes  and  Quigley, 
1981)  or  by  a  modification  of  the  alkaline-lysis  method  of  Birnboim  and 
Doly  (1979)  and  cesium  chloride  purification.    The  method  of  Cockburn 
and  Seawright  (1988)  was  used  to  prepare  insect  genomic  DNA.  Standard 
methods  were  used  for  restriction  analysis  of  plasmid  and  genomic  DNA 
(Maniatis et al., 1982) except restriction enzymes were used in excess
of the manufacturers' recommendations (BRL). RNA and DNA were
quantified by UV Abs260.

Prior to hybridization, nitrocellulose filters were baked for 1/2
to 2 hr at 80°C under vacuum and prehybridized in 5X SSPE (20X SSPE is
3.6 M NaCl, 0.2 M NaH2PO4 pH 7.4, 20 mM EDTA pH 7.4), 0.1% SDS and 1%
non-fat dry milk (NFDM). The heterologous D. melanogaster probes were
hybridized to nitrocellulose lifts of mosquito-library plaques and
Southern transfers at 65°C in 5X SSPE, 0.1% SDS, and 1% NFDM.
Prehybridizations and hybridizations with homologous probes were
performed at 56°C in 5X SSPE, 0.1% SDS, 1% NFDM and 25% formamide.

All  films  used  for  autoradiography  were  Kodak  X-AR  (Tm)  with  Kodak 
X-Omatic Regular (Tm) intensifying screens. The Escherichia coli
DH5-alpha strain was the host for all plasmids. Bacteria were grown on
Luria-Bertani culture medium with 50 µg/ml ampicillin selection.

Isolation  and  Subcloning  Mosquito  Heat-Shock  Genes 

A partial-digest Sau 3A library of genomic DNA from the mosquito
A. albimanus was constructed in bacteriophage EMBL3 (S. Mitchell) and
cultured in a P2 lysogen, host strain P2392, to select against
non-recombinant phage. This library was screened with nick-translated
heat-shock probes for the D. melanogaster 70 and 83 kDa heat-shock-protein
genes: plasmid probe aDm2.13 contains the conserved amino-terminal
coding region of the D. melanogaster Hsp70 gene (Claudia Sutton,
personal communication), and clone pPW244 contains the entire Hsp83 gene
(Holmgren et al., 1979). Positively-hybridizing clones were purified,
and preliminary restriction maps were constructed by analysis of
fragments separated by agarose gel electrophoresis. Restriction
fragments that hybridized to the D. melanogaster clones were identified
by Southern hybridization (Maniatis et al., 1982), subcloned into pUC19,
and restriction mapped to higher resolution.

Transcript  Analysis 

Larval  mosquitos  were  reared  at  25°C  according  to  the  method  of 
Benedict et al. (1979). For heat shocks, 4th stage larvae were
transferred  to  100ml  plastic  beakers,  the  bottom  of  which  had  been 
replaced  with  fine  screen.    These  were  suspended  in  water  for  30  minutes 
in  all  experiments.    Water  baths  at  40°C  were  used  except  where  noted. 
Nonshocked  larvae  were  maintained  at  25°C. 

Total RNA was prepared from 4th stage larvae by a guanidinium
thiocyanate-phenol method (Chomczynski, 1987). Oligo-dT cellulose (BRL)
was used to isolate polyadenylated RNA (Maniatis et al., 1982). For
northern analysis, RNAs and 1 kb ladder DNA size standards were
glyoxalated and run on 1.0% agarose gels in 0.01 M NaPO4 buffer at <5.5
V/cm (Maniatis et al., 1982). Prehybridizations and hybridizations were
at 55°C in 5X SSPE, 25% deionized formamide, 200 µg/ml salmon sperm DNA,
0.1% SDS, and 5X Denhardt's reagent.

Dot blots were used to quantify the relative heat-shock transcript
levels under normal and heat-shock conditions. The blots consisted of
5 µg of total RNA bound to nitrocellulose filters, and probed with
nick-translated clone p70a.16. To determine the total amount of RNA in each
dot, duplicate dot blots were probed with an A. albimanus ribosomal DNA
probe. Exposed autoradiograms were quantified by densitometry using a
cutoff of 0.02 absorbance units/mm² as the minimum peak value
integrated. This value was also used to calculate relative absorbances
when no peak was detected.
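
The dot-blot quantification described above can be sketched as a small calculation. This is a minimal illustration, not the procedure actually used: it assumes the heat-shock signal for each sample is divided by the ribosomal DNA signal from the duplicate blot, and that the 0.02 cutoff is substituted whenever no peak is detected. The function names and the example readings are hypothetical.

```python
CUTOFF = 0.02  # minimum integrated peak value stated in the text


def floor_at_cutoff(signal, cutoff=CUTOFF):
    """Substitute the cutoff when no peak (or a smaller one) was detected."""
    return max(signal, cutoff)


def relative_level(hsp_signal, rdna_signal):
    """Heat-shock signal normalized to the rDNA loading control."""
    return floor_at_cutoff(hsp_signal) / rdna_signal


def fold_induction(shocked, nonshocked):
    """Each argument is a (hsp_signal, rdna_signal) pair of readings."""
    return relative_level(*shocked) / relative_level(*nonshocked)


if __name__ == "__main__":
    # Hypothetical densitometer readings, not data from this study.
    shocked = (5.0, 1.0)
    nonshocked = (0.0, 1.0)  # no peak detected, so the cutoff is used
    print(f"fold induction: {fold_induction(shocked, nonshocked):.0f}")
```

Because the cutoff stands in for undetected peaks, any fold induction computed against a nonshocked sample with no detectable signal is a lower bound rather than an exact ratio.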

Before  subcloning  into  plasmids,  cDNA  probes  were  used  to 
determine  the  regions  of  the  lambda  clones  which  might  be  transcribed. 
cDNA  probes  were  prepared  by  annealing  oligo-dT  primers  (BRL)  to  total 
RNA  and  extending  the  primers  with  cloned  M-MLV  reverse  transcriptase 
(BRL).    Prehybridization,  hybridization  and  autoradiography  were  the 
same  as  for  Southern  analysis. 

Primer  extension  was  used  to  determine  the  transcription  start 
sites. Oligonucleotides based on putative leader sequences were 5'
end-labeled with 32P (6000 Ci/mMol, New England Nuclear, NEN) using T4
kinase (BRL). End-labeled oligonucleotides were annealed to
complementary RNAs at 50°C for 5 hr in 1X hybridization buffer (5X is 2
M NaCl, 0.2 M PIPES (pH 6.5), and 5 mM EDTA), and extended with cloned
M-MLV reverse transcriptase. The extension products were resolved on 6%
denaturing  sequencing  gels.    Sequence  standards  for  the  extension 
experiments  were  chain-terminated  sequencing  reactions  of  the  mosquito 
Hsp70  clones  using  approximately  5  X  10^  dpm  of  the  same  end-labeled 
primers. 


In situ Hybridization to Polytene Chromosomes

In  situ  hybridizations  to  polytene  chromosomes  were  performed  by  a 
method (Johnson-Schlitz and Lim, 1987) which has been modified for
mosquito  polytene  chromosomes  by  S.E.  Mitchell  (personal  communication). 
Briefly,  salivary  chromosomes  from  middle  to  late  4th  stage  larvae  were 
dissected  in  45%  acetic  acid,  transferred  to  1:2:3  (lactic  acid:  water: 
glacial  acetic  acid  respectively)  and  squashed  under  siliconized 
coverslips.    The  slides  were  refrigerated  overnight  and  then  frozen  at 
-70°C.    While  still  frozen,  the  coverslips  were  removed  and  the  slides 
were  transferred  to  absolute  ethanol  at  -20°C,  and  allowed  to  warm  to 
room temperature. The slides were then dried, acetylated, denatured in
70  mM  NaOH  for  1  minute,  and  hybridized  to  nick-translated  biotinylated 
DNAs  (bio-21-dUTP,  Clontech)  labeled  according  to  the  nick-translation 
kit  recommendations  (BRL).    Hybridization  was  detected  with 
streptavidin/alkaline  phosphatase  (Clontech)  using  the  substrates  NBT 
and  BCIP.    Chromosomes  were  counter-stained  with  Giemsa  for  about  three 
minutes  and  observed  by  phase-contrast  microscopy.    Band  designations 
were made using the standard polytene chromosome map (Keppler et al.,
1973). 

DNA  Sequencing 

The  subcloning  strategies  were  designed  to  eliminate  the  transfer 
of  deleted  fragments  to  new  vectors  and  to  allow  use  of  the  standard 
M13/pUC-universal  and  T7/T3-alpha-reverse  priming  sites  flanking  the 
multiple  cloning  site  of  pUC19. 

Deletions for sequencing p70a were made by the Kilo-sequencing
method of Barnes (Barnes, 1983) as follows: supercoiled plasmids were
nicked with DNase I, the nick widened with Exonuclease III, and the
resulting single-stranded DNA digested with Bal 31. The unique
left-hand Sal I site in the polylinker was digested, the ends filled using
cloned Klenow fragment of DNA Polymerase I (Klenow)(International
Biotechnologies Inc., IBI), and the deletion-bearing plasmids religated.
Additional sequence was obtained from p70a.16 with a custom primer
based on its sequence, CGTTGAAGTAGGCTG (position 1621, Figure 3-12).
The 3' ends of the p70a open reading frames (ORFs) were extended with two
other primers based on their sequence: GCAGCCAAGGAT (positions 596 and
4498), and CGGAGAAGGAAGAGTACGAGCACCAAATGC (positions 285 and 4790). (All
DNA sequences throughout this paper will be shown 5-prime (5') to
3-prime (3').)

Subclone p70a.d1 was a product of the deletion subcloning for
sequence analysis of p70a (Figure 3-3). It was utilized in in situ
hybridizations as a clone-specific probe.

Subclones  for  sequencing  the  other  A.  albimanus  Hsp70  clone  (p70b) 
were  made  by  cutting  the  plasmid  at  the  unique  Mlu  I  site  in  the  center 
of  the  insert,  digesting  bi-directionally  with  Exonuclease  III,  and 
removing  aliquots  at  90  second  intervals.    The  single-stranded  DNA  was 
then digested with S1 nuclease. The remaining right-hand portion of the
sequence  was  removed  by  digesting  the  unique  Hind  III  site  in  the 
multiple  cloning  site,  and  the  ends  were  filled  using  Klenow  before 
ligation  at  a  dilute  concentration.    Bacteria  were  transformed  by 
standard  methods  (Hanahan,  1983)  and  screened  for  appropriately  sized 
plasmids.    Additional  sequence  was  obtained  by  sequencing  the  parent 
plasmid (p70b) and a subcloned fragment, p70b.5.

Sequencing reactions were performed on boiling-method preparations
of 2-5 µg of plasmid DNA prepared from 2 ml overnight grow-ups. Plasmids
were alkali-denatured for primer annealing. Sequenase version 1.0 or
2.0 (Tm, U.S. Biochemical Corp.) was used for chain-termination
sequencing (Sanger, 1977) with manufacturer-supplied reaction solutions
and procedures. Reaction products were labeled with 35S dATP
(5000 Ci/mMol, NEN) in buffers containing Mg2+. Alternatively Mn2+ was
added to increase readability close to the primer for primer extension
standards (Tabor and Richardson, 1989). Most sequencing reactions were
run on 0.2-0.9 mm wedge gels (4% acrylamide (19:1 linear to bis, LKB),
8 M urea, 1X TBE) at 55°C, 1750 volts on an LKB Macrophor (Tm), Sequigen
(Tm, BioRad), or user-built electrophoresis unit. All gels were rinsed
for 10 min in 10% acetic acid before drying in a forced-air oven at
80°C. Gels run on the Macrophor were bonded to the running plate, but
others were transferred to filter paper for drying and autoradiography.
Sequence analysis was done on the Genetics Computer Group (GCG) Software
Package (Devereux, 1984) version 6.1, and the Multiple Sequence Editor
(Massachusetts Institute of Technology) both running on a MicroVAX II
computer. DNA sequences came from Genbank version 60 (June, 1989) or
European Molecular Biology Laboratory version 19 (May, 1989) databases.

Dot  plot  comparisons  to  identify  sequences  shared  by  mosquitos  and 
D. melanogaster were done using the computer program D3HOM (Fristensky,
1984).    For  mosquito/mosquito  comparisons,  the  homology  range  was  10 
bases  and  the  minimum  homology  displayed  was  60%.    For  mosquito/D. 
melanogaster  comparisons,  the  homology  range  was  three  and  the  minimum 
homology  displayed  was  80%.    In  both  cases  the  scale  factor  was  0.95. 
These parameters were empirically determined.

Heat shock element (HSE) consensus sequences were identified by
computer analysis using the consensus of Pelham (Pelham, 1982) with
equal weights for all positions, the weighting scheme and definition of
Xiao and Lis (1988), and the frequencies of Amin et al. (1988) (Figure
3-1). I wrote a BASIC program WEIGHTS to scan and assign scores to
nucleotide sequences based on a user-defined weight matrix (Figure 3-2).
The program uses a sliding-window approach similar to many
database-searching and dot plot programs. In this case, a window of the sequence
in  question  is  compared  nucleotide-by-nucleotide  with  the  weight-matrix 
values  and  the  sum  becomes  the  score  of  that  window.    The  window  is  then 
advanced  to  the  next  position  one  nucleotide  down  and  the  process 
repeated.    The  score  obtained  represents  the  degree  of  match  to  the 
matrix  sequence  with  provision  for  unequally  weighting  each  position  in 
the  matrix  sequence.    The  program  is  suitable  for  scoring  any  ungapped 
sequence  of  up  to  50  nucleotides  for  which  nucleotide  frequency  data  is 
available  such  as  TATA  boxes  and  cap  sites  (Bucher  and  Trifonov,  1986), 
translation  start  sites  (Cavener,  1987)  and  polyadenylation  sites 
(Birnstiel,  1985).    The  program  also  calculates  the  average  score  and 
standard  deviation.    It  is  available  from  the  author  in  BASIC  language 
or  compiled. 
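
The sliding-window scoring described above can be illustrated with a short sketch. This is not the WEIGHTS program itself, which was written in BASIC; it is a Python approximation of the stated procedure. The equal-weight matrix is built from the HSE consensus as it is commonly quoted (CTNGAANNTTCNAG), which is treated here as an assumption, and the test sequence, function names, and the choice of a population standard deviation are all hypothetical.

```python
from statistics import mean, pstdev


def consensus_matrix(consensus, weight=1.0):
    """One column per position; 'N' positions score 0 for every base."""
    return [{} if c == "N" else {c: weight} for c in consensus.upper()]


def scan(sequence, matrix):
    """Score every window of len(matrix) bases.

    Returns a list of (1-based start position, score) pairs, advancing
    the window one nucleotide at a time.
    """
    seq = sequence.upper()
    width = len(matrix)
    scores = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(col.get(base, 0.0) for col, base in zip(matrix, window))
        scores.append((start + 1, score))
    return scores


# Equal-weight matrix from the assumed consensus; real use would load
# position-specific weights or frequencies instead.
HSE = consensus_matrix("CTNGAANNTTCNAG")

# Hypothetical test sequence with one perfect match starting at base 4.
hits = scan("AAACTCGAATGTTCGAGAAA", HSE)
best = max(hits, key=lambda hit: hit[1])
values = [score for _, score in hits]

if __name__ == "__main__":
    print("best window:", best)
    print("mean score:", round(mean(values), 2))
    print("std. dev.:", round(pstdev(values), 2))
```

Replacing the dictionaries in the matrix with per-base frequencies or unequal weights gives the weighted variants mentioned in the text without changing the scanning loop.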

Dinucleotide frequency chi-square analysis was performed using the
procedure FREQ in the SAS software package.

Results and Discussion

Isolation  and  Mapping  the  Mosquito  Hsp70  and  Hsp83  Genes 

Of 50,000 EMBL3 clones screened, two hybridized to the D.
melanogaster Hsp70 probe and were given the names 70a and 70b. The
haploid genome size of anopheline mosquitos is about 2 X 10^8 bp (Rao and
Rai, 1987) and the average insert size in the library is expected to be
15,000 bp. Therefore, assuming random representation of sequences, the
probability of missing a single-copy gene in this screen was 0.02. The
two pUC19 subclones p70a and p70b (from lambda 70a and 70b respectively)
did  not  overlap  and  were  presumed  to  represent  different  loci.  The 
restriction  map  of  each  clone  has  an  axis  of  symmetry  (Figures  3-3  and 
3-4).    This  axis  in  p70a  is  flanked  by  2.5  kbp  of  sequence  with 
identical  six-base  restriction  sites  (as  determined  by  restriction 
mapping)  and  in  p70b  by  2.2  kbp  of  moderately-conserved  restriction 
sites.    As  the  sequence  data  presented  below  will  show,  each  clone 
contains two Hsp70-similar genes in this palindrome; p70a has two entire
genes,  and  p70b  has  one  entire  gene  and  80%  of  the  coding  region  of  a 
second  gene,  the  remainder  of  which  was  deleted  in  the  original  cloning. 
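
The probability of missing a single-copy gene quoted above can be reproduced with a standard library-coverage estimate. The text does not state which formula was used, so the sketch below assumes that each clone independently carries a given sequence with probability insert size / genome size, giving P(miss) = (1 - insert/genome)^N. The function name is hypothetical; the genome size, insert size, and number of clones are the values given in the text.

```python
def prob_missing(clones_screened, insert_bp, genome_bp):
    """Chance that none of the screened clones contains a given sequence,
    assuming random and independent representation of the genome."""
    return (1.0 - insert_bp / genome_bp) ** clones_screened


if __name__ == "__main__":
    # Values from the text: 50,000 clones, 15,000 bp inserts,
    # haploid genome of about 2 x 10^8 bp.
    p = prob_missing(50_000, 15_000, 2e8)
    print(f"P(miss) for the Hsp70 screen: {p:.2f}")
```

The estimate ignores clones in which the gene is split at an insert boundary, so it is best read as an approximation rather than an exact figure.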

One  subclone  from  each  of  the  two  candidate  Hsp70  mosquito 
subclones  p70a  and  p70b  was  isolated  for  probing  genomic  Southerns.  The 
p70a left-hand central 1.2 kbp Eco RI/Bgl II fragment was transferred to
the Eco RI/Bam HI-digested plasmid vector pIBI30 (IBI) as clone p70a.16
(Figure  3-3).    Sequence  data  will  show  that  this  sequence  contains  0.7 
kbp  of  highly  conserved  coding  sequence  and  0.5  kbp  of  nonconserved 
upstream  leader  and  promoter  region.    The  central  0.7  kbp  Xba  I  fragment 
of p70b was cloned into the Xba I site of pIBI30 to create p70b.5
(Figure 3-4). This fragment contains only regulatory sequences and does
not  cross-hybridize  with  p70a. 

Of 34,000 clones screened with the Hsp83 gene of D. melanogaster
(pPW244), one positive clone was identified, subcloned into pUC19 as
p83a  and  mapped  (Figure  3-5).    The  probability  of  missing  a  single-copy 
gene  in  this  screen  was  0.10.    Two  regions  of  p83a  hybridize  both  to  the 
Drosophila  Hsp83  probe  and  to  radiolabeled  heat  shock  cDNA  (Figure  3-9). 
These data are consistent with two genes, a pseudogene(s), or a single
gene  containing  an  intron. 

One  subclone  of  p83a  was  isolated  to  probe  genomic  Southern 
transfers by cloning a 1.7 kbp Xba I/Bgl II fragment into the Xba I/Bam
HI site of pIBI30. This clone includes regions of p83a to which both
the  D.  melanogaster  Hsp83  (data  not  shown)  and  mosquito  heat-shock  cDNA 
probes  hybridized  (Figures  3-5  and  3-9). 

Southern Analysis of Mosquito Hsp70 and Hsp83 Genes

In  order  to  determine  the  number  of  genes  homologous  to  each  of 
these  clones  and  to  reveal  potential  cloning  artifacts,  the  subclones 
were  used  as  probes  of  genomic  Southern  blots.    Additional  genes  would 
appear  as  unexpected  fragments,  and  cloning  artifacts  would  be  detected 
by  the  absence  of  fragments  predicted  from  the  restriction  digests  of 
the  cloned  probes. 

Filters of restricted genomic DNA were probed with the
p70b-specific subclone p70b.5 (the axial Xba I fragment of p70b) (Figure
3-6). A single major band appears in each lane: Nsi I, a band of 6.5 kbp,
placing a second Nsi I site asymmetrically just outside the cloned
portion of the left gene; Mlu I, a 5.4 kbp band, indicating another
Mlu I site is symmetrically located in the genome in the uncloned
portion of the left gene; Eco RI, the 2.7 kbp central Eco RI fragment of
p70b; and Xba I, the 0.7 kbp fragment corresponding to the probe p70b.5.
These data indicate that there is one copy of the p70b sequence in the
genome.

Similar digests were probed with p70a.16 (Figure 3-6). This
subclone  hybridizes  to  the  coding  and  noncoding  regions  of  the  parent 
plasmid  p70a,  but  less  intensely  to  p70b  due  to  coding  and  noncoding 
sequence  divergence,  and  not  at  all  to  the  axial  Xba  I  fragment  of  p70b. 
These  digests  are  interpreted  as  follows:    Nsi  I,  the  6.5  kbp  band  is 
from  70b,  the  more  intense  4.4  kbp  band  is  the  predicted  central  Nsi  I 
fragment of p70a, and the 3.3 kbp band is of unknown origin; Mlu I, the
5.4  kbp  fragment  is  from  70b,  and  the  intense  7.0  kbp  fragment  is  from 
70a  (the  predominant  mosquito  allele  has  an  internal  Mlu  I  site  not 
present  in  p70a);  Xba  I,  one  faint  band  is  the  4.0  kbp  band  expected 
from  the  right  gene  of  p70b.7  and  either  the  6.0  kbp  fragment  or  the 
upper  band  in  the  complex  around  3.0  kbp  may  represent  the  left  gene. 
If  the  central  Xba  I  site  of  p70a  is  not  present  in  all  alleles,  then 
3.0  and  4.5  kbp  fragments  would  result  when  it  is  present,  and  a 
fragment  of  7.1  kbp  when  it  is  absent.    The  sixth  fragment  is 
unexplained.    Preliminary  Southern  analysis  of  individual  mosquitos 
shows  that  Xba  I  fragment  polymorphism  does  exist  (data  not  shown). 
Although the p70a.16 hybridizations are confounded by
cross-hybridization, p70a is probably a single copy in the A. albimanus genome
and  occurs  in  the  form  cloned  or  as  close  variants. 

The  genomic  DNA  fragments  bearing  Hsp70  genes  that  were  detected 
here are not consistent with those tentatively identified by Narang et
al. (1985). Specifically, the sizes of the Hind III and Eco RI
fragments  that  they  determined  by  probing  genomic  Southerns  with 
D.  melanogaster  probes  are  not  the  same  as  in  the  sequences  I  have 
studied.    This  may  be  due  to  mosquito  strain  differences. 

I cannot be certain that other Hsp70-related genes such as the
Hsp68 (Holmgren et al., 1979) and Hsc70 heat shock cognate genes (Ingolia
and Craig, 1982)(Craig et al., 1983) are not present since the
hybridization  conditions  used  were  stringent.    Data  presented  below  from 
in  situ  hybridizations  will  clarify  the  organization  of  the  Hsp70  and 
Hsp83  genes. 

Restricted genomic DNA probed with p83a.13 confirms the accuracy
of the restriction map of this clone (Figure 3-6). p83a.13
cross-hybridizes with the 2.2 kbp Xba I/Bgl II fragment of p83a.1 (data not
shown) so p83a also contains two regions of similar sequence. The
digests in Figure 3-6 can be interpreted as follows: Xba I, two bands of
3.0  and  3.25  kbp  are  the  cross-hybridizing  central  Xba  I  fragment  and 
the  p83a.l3  fragment  that  extends  to  an  Xba  I  site  just  outside  the 
right-hand  plasmid  cloning  site;  Nsi  I,  the  cross-hybridizing  6.5  kbp 
left-hand  fragment  expected  and  a  second  5.8  kbp  fragment  extending  to 
an  Nsi  I  site  to  the  right  of  the  cloning  site;  Hind  III,  one  prominent 
band  of  6.3  kbp  is  present  so  equidistant  sites  must  be  located  on 
either  side  of  the  single  site  in  p83a,  or  a  second  very  large  fragment 
was  not  resolved  on  this  gel;  Bgl  II,  a  single  band  of  5.1  kbp  is  the 
expected  central  fragment  containing  both  the  cross-hybridizing  regions. 
These  data,  together  with  transcript  analysis  discussed  below,  reveal 
that  the  sequences  present  in  this  clone  are  palindromic  with  an  axis 
roughly centered on the single Hind III site. The restriction-site
[...] particular sites in either half.

In  Situ  Hybridizations 

As  described  in  the  previous  section,  two  distinct  pairs  of  Hsp70 
clones  were  found  in  the  genomic  library.    However,  some  fragments  in 
the  genomic  Southern  hybridizations  could  not  clearly  be  identified  with 
the  clones  isolated,  so  the  exact  number  of  loci  containing  Hsp70  genes 
was  unclear. 

To  clarify  this  situation,  probes  were  selected  for  in  situ 
hybridizations  to  polytene  chromosomes  to  define  the  number  of  locations 
of these genes. Clone p70a.16 was expected to hybridize to all Hsp70
loci, but to the p70a locus most intensely. Two clone-specific
plasmids, p70a.d1 and p70b.5, were used to distinguish their respective
loci.

Hybridization of p70a.16 to the polytene chromosomes occurred at
two loci on the right arm of chromosome 2 in most complements; a strong
signal was seen in the proximal bands of region 13C and a weaker signal
in the proximal band of 11C (Figure 3-7). Although hybridization could
not  be  seen  clearly  in  both  bands  in  all  chromosome  complements  of  each 
individual,  both  loci  did  show  signal  in  most  complements. 

p70a and p70b were tentatively assigned to 13C and 11C
respectively. This was confirmed by subsequent probing of the salivary
polytene chromosomes with the clone-specific probes p70a.d1 and p70b.5.
Clone p70a.d1 hybridized only to 13C and p70b.5 only to 11C,
conclusively showing that clone p70a is derived from locus 13C and p70b
from 11C. This genomic organization of two pairs of genes in divergent
orientation on the same chromosome arm is therefore the same as that
observed in most Drosophila spp. (Leigh-Brown and Ish-Horowicz, 1981).

Clone  p83a  hybridized  to  a  single  locus,  the  distal  band  of  40A, 
on  the  left  arm  of  chromosome  3  (Figure  3-7).    A  single  site  of 
hybridization  is  consistent  with  the  Southern  hybridizations  and  shows 
the  A.  albimanus  Hsp83  gene(s)  is  located  at  a  unique  locus  as  in 
D. melanogaster (Holmgren et al., 1979).

Transcript  Analysis 

Northern  analysis.    The  mosquito  Hsp70  and  Hsp83  clones  were 
isolated  in  this  study  by  an  approach  independent  of  transcription  or 
induction  under  heat  shock.    Therefore,  RNAs  were  analyzed  to  determine 
the  sizes  of  the  transcripts  which  hybridize  to  the  clones  isolated,  and 
their  relative  abundance  under  normal  and  heat-shock  conditions.  For 
this  purpose,  gel  electrophoresis  was  performed  for  northern  analysis. 

Northern  analysis  of  total  and  polyadenylated  RNA  shows  that  RNAs 
that  hybridize  to  the  p70a  and  p70b  probes  are  polyadenylated  and 
strongly  induced  upon  heat  shock  (Figure  3-8).    The  filters  of  total 
heat-shock  RNAs  probed  with  p70a,  p70b  or  p83a  give  size  estimates  of 
2.6  kb  for  the  Hsp70  transcripts  and  of  3.0  kb  for  the  Hsp83.  Sequence 
analysis of p70a and p70b (see below) predicts transcripts of 2.1 kb
before  polyadenylation  which  is  consistent  with  the  size  of  the  RNA 
detected  here.    These  sizes  are  also  consistent  with  RNAs  encoding 
proteins  of  70  and  83  kDa. 

cDNA-probing of lambda clones. Southern hybridization of
restricted  lambda  clones  probed  with  cDNAs  from  heat-shocked  larvae 
identified  sequences  that  hybridize  to  abundant  mRNAs.  These 
preliminary analyses were done using RNA from larvae shocked at 37°C for
30  minutes,  using  the  induction  temperature  that  is  optimal  for 
D.  melanogaster  Hsp70. 

When  restriction  digests  of  lambda  clones  70a  and  70b  were  probed 
with  cDNAs  made  from  either  nonshocked  or  heat-shocked  larvae,  fragments 
were  detected  that  are  consistent  with  the  regions  believed  to  be 
transcribed  based  on  sequence  analysis  (Figure  3-9).    This  also 
demonstrated  that  these  genes  are  transcribed  at  a  low  level  until  heat- 
shock  induction.    The  first  four  lanes  in  the  left  hand  panel  show  that 
the 9.4 kbp Hind III/Sal I fragments of 70a and 70b contain all of the
sequences complementary to abundant RNAs during heat shock. Only
faintly-hybridizing bands were observed when digests of 70a or 70b were
probed with nonshock cDNA. The Xho I/Sal I digest of 70a yields
hybridizing  fragments  of  2.5  kbp,  the  left-hand  fragment  which  extends 
slightly  into  the  downstream  end  of  the  coding  region,  and  a  larger  9.4 
kbp  fragment  which  is  the  downstream  end  of  the  right-hand  gene  and 
flanking  DNA.    The  failure  of  either  the  central  or  internal  Xho  I 
fragments  to  hybridize  indicates  that  the  polyA-primed  cDNA  extensions 
generally  terminated  before  these  regions  were  reached.    The  Xho  I 
digest  of  70a  gives  three  hybridizing  fragments,  some  of  which  appeared 
to  be  due  to  a  partial  digest  and  were  not  interpretable.    Only  one  70b 
fragment is detected in the Xho I or Xho I/Sal I digests, representing
the 3' end of the right gene. No fragment from the left gene is
detectable since only the 3' ends of the genes are labeled, and this
part  of  the  left  gene  was  not  cloned. 

Clone  83a  contains  sequences  that  hybridize  to  abundant  RNAs  in 
both normal and heat-shocked larvae (Figure 3-9). Only one hybridizing
band is seen in the Bam HI/Sal I and Sal I digests: the Sal I fragment
subcloned into p83a. The insert contains no Bam HI sites. In the
Hind III/Sal I digest, the 5.3 and 3.5 kbp fragments are the left and
right fragments of the Sal I fragment subcloned into p83a. Two major
regions of 83a hybridize to the cDNA probes (Figure 3-5), confirming that
p83a contains two regions with similar sequences, as would be expected
based on the cross-hybridization of p83a.l3 to the other p83a
Xba I/Bgl II fragment, and the hybridization pattern of the Drosophila
Hsp83  probe. 

The  filters  of  Figure  3-9  were  probed  with  aliquots  of  the  same 
cDNA  probe,  washed  similarly,  and  the  films  were  exposed  for  the  same 
amount  of  time.    Therefore,  comparisons  of  the  signals  suggest  that  the 
Hsp83  genes  are  normally  transcribed  at  moderately  high  levels  relative 
to the Hsp70 clones, and are induced only slightly at 37°C. The Hsp70
clones have much lower levels of nonshock transcription, and show greater
induction at 37°C than Hsp83.

Dot  blots  of  total  RNA.    In  order  to  determine  the  effect  of 
various  temperatures  on  Hsp70  transcript  levels,  dot-blot  analysis  was 
performed  by  hybridizing  one  of  the  mosquito  Hsp70  subclones,  p70a.l6, 
to  total  RNA. 

Maximal  expression  was  observed  at  40°C,  rather  than  at  37°C  as  in 
Drosophila  (Figure  3-10).    Heat-shock  transcript  levels  at  40°C  ranged 
from 15 to 335 times higher than controls (avg. = 143). The temperature
at which the highest transcript levels were observed is similar to that
of Aedes albopictus (Gerenday, 1989; Berger et al., 1985) and Plodia
cells in culture (Berger et al., 1985). No RNA isolations were done
using larvae shocked at 43°C since mortality was observed at that
temperature. 

Once  I  had  determined  that  maximal  induction  occurred  at  40°C, 
experiments  were  conducted  to  determine  the  relative  transcript  levels 
over  time:  before,  during,  and  after  a  30  minute  heat  shock.    Dot  blots 
of  total  RNA  probed  with  p70a.l6  showed  that  transcripts  increase  140- 
to  520-fold  (average  =  320)  and  peak  within  15  minutes  of  heat  shock 
(Figure 3-10). They then gradually decrease, but transcripts are still
easily detectable 2½ hours after the shock ends. The average
transcript  level  at  30  minutes  is  275  times  that  of  controls,  which  is 
similar  to  the  induction  observed  in  the  temperature  dot-blot 
experiments  above.    Since  p70a.l6  hybridizes  to  both  pairs  of  Hsp70 
genes,  the  induction  measured  is  a  composite  of  RNAs  transcribed  from 
all  four  genes.    I  collected  no  data  that  clearly  indicate  the  relative 
contribution  of  the  four  genes  present.    However,  I  have  observed 
consistently  stronger  signals  from  p70a  in  northern  and  cDNA  analysis. 

The  results  of  the  primer-extension  experiments  will  be  discussed 
following  the  DNA  sequence  data. 

DNA Sequence of p70a and p70b

The  DNA  sequences  of  p70a  and  p70b  were  determined  for  the 
putative  coding  and  promoter  regions  of  both  pairs  of  genes,  except  for 
the  C-terminal  end  of  the  left  gene  of  p70b  which  was  truncated  in  the 
clone.    The  major  features  of  p70a  and  p70b  sequences  are  listed  in 
Table  3-1.     Each  plasmid  contains  two  large  divergently  oriented  open 
reading  frames  (ORFs)  (Figures  3-3,  3-4,  3-12  and  3-13).    The  right-  and 
left-hand ORFs and conserved upstream regions will be referred to as the
right  and  left  genes. 

DNA  sequence  of  p70a  genes.    The  p70a  transcription  start  sites 
and  putative  translation  start  and  stop  codons  predict  open  reading 
frames  of  1923  bp  for  both  genes  and  mRNAs  with  untranslated  leaders  of 
222  and  231  bp  for  the  left  and  right  genes  respectively.    Only  515  bp 
separate  the  transcription  start  sites.    The  region  between  the  TATA 
boxes  is  slightly  A+T-rich  (55%)  though  not  as  greatly  so  as  the  spacer 
region  of  the  D.  melanogaster  Hsp70  genes  (Torok  and  Karch,  1980).  This 
is  close  to  the  average  composition  of  58%  A+T  for  A.  albimanus 
determined  by  A.  F.  Cockburn  (personal  communication)  and  indicates  that 
although  unusually  high  A+T  content  is  seen  in  the  D.  melanogaster  Hsp70 
spacer,  the  mosquito  spacer  has  average  composition.    There  is 
remarkable  sequence  similarity  between  the  two  genes  from  150  bp 
upstream  of  the  TATA  boxes  to  the  distal  ends  of  the  ORFs,  although 
there  are  insertion/deletions,  particularly  in  the  untranslated  leaders 
(Figures 3-14 and 3-15). In contrast to the promoter and transcribed
regions,  the  left  and  right  genes  have  no  obvious  sequence  conservation 
downstream  of  the  translation  stop  codons  (to  be  discussed  in 
Chapter  4). 
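
The A+T percentages quoted above are simple base-composition ratios. As a minimal sketch only (the function name and the toy sequence are illustrative assumptions and are not taken from the p70a sequence or from the original analysis), the calculation could be written as:

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A+T among the unambiguous bases (A, C, G, T) of seq."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return at / total


# Hypothetical 20-base spacer fragment, used only to exercise the function.
toy_spacer = "ATATGCGCAATTGGCCATAT"
print(f"{at_fraction(toy_spacer):.0%}")  # prints 60%
```

Ambiguous characters such as N are ignored rather than counted, so partially resolved sequence does not bias the ratio.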

I  observed  26  nucleotide  differences  between  the  1923  bases  of  the 
protein-coding  regions  of  the  left  and  right  genes:  1,  0,  and  25  at  the 
first,  second  and  third  positions  of  codons  (18  transitions,  8 
transversions). The predicted amino acid sequences differ by one
conservative substitution at residue 562: aspartic vs. glutamic acid.
The  predicted  molecular  weights  of  the  left  and  right  gene-encoded 
proteins  are  70,251  and  70,237  Daltons  (Da). 
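
The tally above (differences by codon position, and transitions versus transversions) can be reproduced for any pair of aligned coding sequences. The following is a sketch under stated assumptions: the two sequences are already aligned without gaps, the reading frame starts at the first base, only A, C, G and T occur, and the function name and the 12-base toy sequences are hypothetical.

```python
PURINES = {"A", "G"}


def tally_differences(cds_a: str, cds_b: str):
    """Compare two aligned, equal-length coding sequences.

    Returns (by_position, transitions, transversions), where by_position[i]
    is the number of differences at codon position i + 1.
    """
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must be aligned to equal length")
    by_position = [0, 0, 0]
    transitions = 0
    transversions = 0
    for i, (a, b) in enumerate(zip(cds_a.upper(), cds_b.upper())):
        if a == b:
            continue
        by_position[i % 3] += 1
        # A transition stays within a base class (purine/purine or
        # pyrimidine/pyrimidine); a transversion crosses classes.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    return by_position, transitions, transversions


# Hypothetical four-codon example; the last codon differs as GAT (Asp) vs GAA (Glu).
left = "ATGGCCAAAGAT"
right = "ATGGCTAAGGAA"
print(tally_differences(left, right))  # prints ([0, 0, 3], 2, 1)
```

The same function applied to the 1506 bp compared between the p70b genes would be expected to report both differences at the third position, as described later in this chapter.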


Several  large  palindromic  regions  occur  in  the  spacer  region 
centered  around  bases  2394  and  2724.    These  consist  of,  or  are  adjacent 
to  the  heat-shock-element  (HSE)  arrays  just  upstream  of  the  TATA  boxes 
(discussed  further  below).    One  palindrome  of  23  bases  surrounds  the 
Bgl  II  site  at  2516  off  the  central  axis  toward  the  left  gene. 

Primer  extension. 

DNA  sequence  information  alone  is  of  relatively  little  value  for 
identifying  regulatory  regions  unless  the  transcribed  regions  are  known 
precisely. For our purposes particularly, the 5' end of the transcripts
should be mapped since sequences necessary for heat-shock induction are
found upstream of, and within, this region. Primer extension involves
annealing a radiolabeled DNA primer to RNA, which provides an initiation
site for cDNA synthesis from 3' to 5' along the RNA. The RNA serves as
a template for this enzyme-directed synthesis until the 5' end is
reached.    The  resulting  cDNA  fragment  is  analyzed  on  sequencing  gels 
using  DNA  sequencing  reactions  as  sequence  and  size  standards. 

Primer-extension  experiments  on  total  RNAs  were  used  to  map  the 
transcription  start  sites  of  the  Hsp70  genes.    Minor  sequence  divergence 
in  p70a  allowed  synthesis  of  right-  and  left-gene-specific  20-mers  with 
three  mismatches:  TCTGATACACTGATTACTTA  and  TCTAATGCACTGATTACTTG 
(positions  2934  and  2161  respectively,  Figure  3-12).    The  specificity  of 
these  was  confirmed  by  using  them  as  primers  to  sequence  p70a  which 
contains  both  annealing  sites.    Since  no  leader-sequence  differences 
downstream  of  the  suspected  transcription  initiation  site  were  available 
to distinguish the right from the left genes of p70b, a synthetic 25-mer
primer (TTATACGCTTTCTGATGCAACAATT) was used to map the transcripts of
both (positions 1573 and 2639, Figure 3-13).
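
The three mismatches between the two p70a 20-mers can be checked directly from the primer sequences given above. This is a small illustrative sketch; the function name is an assumption, and the primers are taken to be written 5' to 3' as is conventional.

```python
def mismatch_positions(primer_a: str, primer_b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length primers differ."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have equal length")
    return [
        i
        for i, (a, b) in enumerate(zip(primer_a.upper(), primer_b.upper()), start=1)
        if a != b
    ]


right_primer = "TCTGATACACTGATTACTTA"  # p70a right-gene-specific 20-mer (from the text)
left_primer = "TCTAATGCACTGATTACTTG"   # p70a left-gene-specific 20-mer (from the text)
print(mismatch_positions(right_primer, left_primer))  # prints [4, 7, 20]
```

One of the three differences falls on the final base of the primers, which, if the sequences are indeed written 5' to 3', places a mismatch at the 3' end where extension is most sensitive to mispairing.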

Primer-extension  experiments  mapped  the  transcription  start  sites 
of  p70a  to  bases  2290  and  2805  (Figure  3-16),  31  bases  from  the  first 
bases  of  the  TATA  boxes.    Identification  of  these  start  sites  accords 
well  with  predictions  based  on  the  D.  melanogaster  start  sites  and 
typical  distances  from  the  TATA  box  (Bucher  and  Trifonov,  1986).  No 
bands  were  observed  in  the  nonshock  RNA  control  lanes. 

The  primer  chosen  for  p70b  hybridizes  to  both  genes  of  that  clone, 
so  the  products  of  this  experiment  could  have  originated  from  one  or 
both  of  the  genes.    However,  since  the  sequence  of  the  pair  is  the  same 
for  about  250  bp  flanking  this  site,  it  is  likely  that  they  are 
transcribed  similarly.    Two  pairs  of  bands  were  observed.    The  more 
prominent  ones  correspond  to  initiation  sites  at  bases  1687  and  1690  for 
the  left  gene,  and  at  2522  and  2525  for  the  right  gene.    Since  the 
putative  TATA  box  is  repetitive  (TATATAAA  in  Fig  3-17),  and  a  tandem 
repeat  of  GTCGTC  is  found  at  the  transcription  start  (Figure  3-15), 
transcription  may  initiate  in  both  positions,  within  29  or  32  bp  from 
the  TATA  box.    This  determination  again  fits  well  with  comparison  of 
similar  sequences  of  D.  melanogaster  and  TATA  box  predictions. 

An unusual sequence observed in the p70a untranslated leaders.
The  untranslated  leader  sequences  contain  a  curious  sequence  of  51  bases 
completely  devoid  of  thymidines  beginning  4  bases  after  the 
transcription  start  sites  (Figure  3-17).    The  major  recognizable  pattern 
is  seven  tandem  repeats  of  CAAG  which  can  be  generalized  to  C-A/G-A-G/A. 
The  same  pattern  is  found  to  a  limited  extent  in  a  similar  location  in 
three  D.  melanogaster  genes  known  to  be  preferentially  transcribed  and 
translated during heat shock: Hsp70 (McGarry and Lindquist, 1985), Hsp22
(Hultmark et al., 1986), and other D. melanogaster heat shock genes
(Figure 3-17). It is not found in other insect genes or the Hsp83 gene,
which are not efficiently transcribed and preferentially translated
under heat shock (compiled by Hultmark et al. (1986)).
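
Both features described above, a long thymidine-free stretch and tandem repeats of the generalized unit C-A/G-A-G/A, are easy to search for in other leader sequences. The sketch below is illustrative only: the function names, the requirement of at least two tandem units, and the toy leader are assumptions, and the toy leader is not the p70a sequence.

```python
import re

# Two or more tandem copies of the generalized unit C-A/G-A-G/A.
MOTIF = re.compile(r"(?:C[AG]A[GA]){2,}")


def longest_t_free_run(seq: str) -> int:
    """Length of the longest stretch of seq that contains no T."""
    runs = re.findall(r"[ACG]+", seq.upper())
    return max((len(run) for run in runs), default=0)


def tandem_motif_runs(seq: str) -> list[tuple[int, int]]:
    """Return (0-based start, number of tandem units) for each motif run."""
    return [(m.start(), len(m.group()) // 4) for m in MOTIF.finditer(seq.upper())]


# Hypothetical leader: seven CAAG units flanked by T-containing sequence.
toy_leader = "GTT" + "CAAG" * 7 + "TACGT"
print(longest_t_free_run(toy_leader))  # prints 28
print(tandem_motif_runs(toy_leader))   # prints [(3, 7)]
```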

Could  this  motif  represent  the  DNA  sequence  responsible  for 
preferential  translation,  and  to  a  lesser  extent  efficient  transcription 
of  these  genes  during  heat  shock?    Undefined  sequences  in  the  first  30 
bases  of  the  leader  are  known  to  be  necessary  for  efficient  heat-shock 
transcription and preferential translation of Hsp22 (Hultmark et al.,
1986), and Hsp70 (McGarry and Lindquist, 1985) and are sufficient to
confer this quality on D. melanogaster YP1 (Kraus et al., 1988) and Adh
transcripts (Klemenz et al., 1985). However, only a very loose consensus
sequence  has  been  identified. 

What  mechanisms  might  account  for  the  supposed  transcriptional  and 
translational  functions  of  this  sequence?    In  D.  melanogaster  Hsp70 
genes,  RNA  Polymerase  II  (Pol  II)  is  known  to  be  transcriptionally 
engaged  near  the  5'  end  of  the  RNA  in  nonshocked  cells  with  an 
approximately  25-base  nascent  mRNA  synthesized,  but  elongation  is 
prevented by some unknown mechanism until heat shock occurs (Perisic et
al., 1989). Perhaps some transcription factor binds to a conserved
sequence in this region to regulate transcription. Alternatively,
capping of mRNA may regulate translatability and transcript stability
differently under heat shock and normal conditions. Maroto and Sierra
(1988) have shown that cap analogues inhibit the translation of normal
D. melanogaster mRNAs and Hsp83 transcripts but not other heat-shock
transcripts. Sequences near the start of heat-shock transcripts may
either interfere with normal capping or bind factors which allow
preferential translation of heat-shock transcripts during shock.

The  CAAG  sequence  is  absent  from  the  p70b  transcription  start 
sites. It is also not present in D. melanogaster Hsp70 clone B8
(Ingolia et al., 1980). Perhaps greater variation exists in the
expression of different copies of Hsp70 genes than has previously been
supposed. The sequence variation in this region might confer
differential expression controls upon the Hsp70 genes, expanding the
repertoire of stress responses, and the presence or absence of repeats of
the above sequence motif may be responsible.

DNA  sequence  of  p70b  genes.    Clone  p70b  contains  two  ORFs  of  1506 
and  1923  bp,  in  divergent  orientation  like  those  of  p70a  (Figures  3-4 
and  3-13).    The  cloning  site  truncates  the  left  gene  at  amino  acid  502, 
but  the  right  gene  is  complete,  and  would  encode  a  peptide  of  641  amino 
acids  and  molecular  weight  70,153  Da.    The  TATA  boxes  are  separated  by  a 
spacer  which  is  60%  A+T. 

Comparison  of  the  sequences  of  the  two  genes  revealed  a  single- 
base  deletion  in  the  right  gene.    This  deletion,  which  was  unequivocally 
identified  in  two  independently  obtained  deletion  subclones  of  p70b  at 
position  3419,  would  alter  the  translation  frame  and  cause  termination 
at  codon  262.    This  deletion  might  exist  only  in  the  parent  clones,  or 
it  may  be  the  native  genomic  form.    Since  no  other  features  of  this 
sequence  suggest  it  is  a  pseudogene,  I  have  tentatively  inserted  an  "N" 
into  the  deletion  to  restore  the  reading  frame  for  sequence  and 
evolutionary  analysis.    The  "N"  is  inserted  in  the  third  position  of  a 
codon  and  does  not  affect  the  predicted  amino  acid. 
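
The effect of a single-base deletion, and of restoring the frame with an "N" at a third codon position, can be illustrated on a toy sequence. This is a sketch under stated assumptions: the 18-base sequence is hypothetical (it is not p70b), the standard genetic code is used, and an "N" is translated only when every possible base gives the same amino acid.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_codon(codon: str) -> str:
    """Translate one codon; N is resolved only if all substitutions agree."""
    choices = [BASES if b == "N" else b for b in codon]
    residues = {CODE["".join(c)] for c in product(*choices)}
    return residues.pop() if len(residues) == 1 else "X"


def translate(seq: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = translate_codon(seq[i:i + 3])
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


intact = "ATGGCTCTGAACGGATAA"               # hypothetical ORF: M A L N G stop
deleted = intact[:5] + intact[6:]           # lose the third base of codon 2
restored = intact[:5] + "N" + intact[6:]    # put an N back at that position

print(translate(intact))    # prints MALNG
print(translate(deleted))   # prints MA (frameshift creates an early stop)
print(translate(restored))  # prints MALNG (GCN encodes alanine whatever N is)
```

The restored sequence gives the same peptide as the intact one only because the affected codon is four-fold degenerate at its third position, which is the situation described for p70b above.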


A  cloning  artifact  was  identified  by  sequencing  through  the  left- 
hand  Sal  I  subcloning  site.  Rather  than  the  desired  Sal  I  insertion  on 
the  left  end,  the  insert  was  added  3'  to  the  Sph  I  5'  "G"  in  the  pUC19 
multiple  cloning  site  (MCS),  thus  preserving  the  MCS  Pst  I  site.  It  is 
highly  improbable  that  the  multiple  cloning  site  would  have  been 
recreated  by  the  insert  so  a  cloning  artifact  is  almost  certain.  All  of 
the  sequences  involved  in  this  artifact  are  vector  sequences  and  do  not 
affect  conclusions  about  the  Hsp70  insert  sequence. 

The  right  and  left  transcription  start  sites  are  separated  by  831 
bp  which  is  316  bp  more  than  in  p70a.    The  length  of  the  nontranslated 
leader  of  both  genes  is  either  181  or  184  bp  depending  on  the 
transcription start site used (discussed above).

The  divergent  genes  of  p70b  are  more  similar  to  one  another  than 
the  p70a  pair.    The  untranslated  leaders  are  identical  and  the  promoters 
have  more  extensive  regions  of  sequence  similarity  upstream  of  the  TATA 
box; about 250 bp rather than 150 bp as in p70a (Figure 3-14). In the
1506 bp of protein-coding DNA compared between the left and right genes,
there are only two differences; both are third-position changes and are
silent.

Sequence comparisons between p70a, p70b and D. melanogaster Hsp70
genes.    The  p70a  and  p70b  genes  share  conserved  sequences  with  each 
other  and  with  the  D.  melanogaster  Hsp70  genes  at  the  translation  start 
site  (Figure  3-15).    Similar  sequences  are  found  in  most  eukaryotic 
genes  (Cavener,  1987).    The  protein-coding  regions  are  similar  as  are 
the  sequences  at,  and  immediately  upstream  of  the  TATA  boxes.  However, 
the  untranslated  leaders  are  very  dissimilar. 

A codon usage table was generated for the A. albimanus Hsp70 genes
and listed parallel to usage for the D. melanogaster Hsp70 (Table 3-2).
Visual inspection indicates differences between A. albimanus and D.
melanogaster usage for threonine, serine, leucine, and proline. This
information  will  serve  as  an  aid  to  codon  selection  for  future  mosquito 
synthetic  gene  construction. 
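
A codon usage table of the kind described above is a count of in-frame triplets. The following minimal sketch assumes the coding sequence starts in frame at its first base; the function name and the five-codon example are hypothetical and do not reproduce Table 3-2.

```python
from collections import Counter


def codon_usage(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Hypothetical fragment encoding Thr-Thr-Thr-Ser-Ser with different synonymous codons.
usage = codon_usage("ACCACCACGTCCAGC")
for codon, count in sorted(usage.items()):
    print(codon, count)
```

Tabulating such counts side by side for two species, grouped by amino acid, gives the comparison made by visual inspection above.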

A  dinucleotide  frequency  table  was  generated  for  the  mosquito 
protein-coding  regions,  putative  untranslated  mRNA-encoding  leader 
sequences,  and  spacers  between  TATA  boxes  (Table  3-3).  The 
distributions  all  deviated  significantly  from  expected  values  (chi- 
square  test,  P  <  0.001,  9  degrees  of  freedom).    GG,  CC,  and  TA  pairs 
were  consistently  under-represented;  GA  and  TC  were  over-represented, 
the  latter  especially  so  in  coding  regions.    Inconsistencies  between 
dinucleotide  frequencies  of  different  types  of  sequences  compared  were 
observed  for  GT  pairs  which  are  frequent  in  the  spacer  and  coding 
regions  but  relatively  infrequent  in  both  leaders. 
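
The 9 degrees of freedom quoted above are what a 4 x 4 table of first base by second base gives, (4 - 1) x (4 - 1) = 9. The sketch below assumes that such a contingency-table test was intended, which the text does not state explicitly; the function name and the test sequence are hypothetical. Overlapping dinucleotides are not strictly independent observations, so the statistic should be read as a descriptive guide.

```python
from collections import Counter
from itertools import product

CRITICAL_9DF_P001 = 27.877  # chi-square critical value for 9 d.f. at P = 0.001


def dinucleotide_chi_square(seq: str):
    """Chi-square statistic for independence of first and second base in dinucleotides.

    Returns (chi2, counts), where counts maps each observed dinucleotide to its count.
    """
    s = seq.upper()
    counts = Counter(a + b for a, b in zip(s, s[1:]) if a in "ACGT" and b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous dinucleotides found")
    row = {x: sum(counts[x + y] for y in "ACGT") for x in "ACGT"}
    col = {y: sum(counts[x + y] for x in "ACGT") for y in "ACGT"}
    chi2 = 0.0
    for x, y in product("ACGT", repeat=2):
        expected = row[x] * col[y] / total
        if expected > 0:
            chi2 += (counts[x + y] - expected) ** 2 / expected
    return chi2, counts


# Hypothetical, strongly non-random sequence: only AC and CA pairs occur.
chi2, counts = dinucleotide_chi_square("ACAC" * 10)
print(round(chi2, 3), chi2 > CRITICAL_9DF_P001)
```

Cells whose expected count is zero are skipped, so a sequence lacking one of the four bases still yields a finite statistic, although the nominal degrees of freedom would then be fewer than nine.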

Protein-binding CT DNA sequences of p70a and p70b. Gilmour et al.
(1989) have identified regions of the Hsp70, Hsp26, and His3 (D.
melanogaster Histone-3) promoters that bind a protein that is supposed
to  have  a  role  in  assembling  and  maintaining  transcriptional  complexes 
in  transcriptional  preparedness.    This  protein  binds  to  regions  of 
alternating  C  and  T  within  approximately  200  bases  upstream  of  the  TATA 
box.    A  similar  if  not  identical  protein  binds  to  the  partially 
complementary sequence C/A-G-A-G-A-G-A-G-C in the D. melanogaster
Ultrabithorax  promoter  (Biggin  and  Tjian,  1988).    In  the  p70a 
interstitial  region,  no  extensive  CT  repeats  occur  although  three  small 
regions  around  2350,  2477,  and  2527  are  similar.    However,  the  sequences 
of the D. melanogaster protein-binding regions are so variable that one
cannot  rule  out  functionally  equivalent  sequences  in  this  clone. 
Sequences  resembling  the  CT  protein-binding  regions  do  occur  in  p70b 
upstream  of  the  TATA  box  around  1910,  extensively  at  2300,  and  also  at 
2450  and  2390.    However,  the  significance  of  these,  if  any,  is  unknown. 

Promoters  of  the  p70a  and  p70b  genes.    Heat-shock  elements  are 
essential  regulatory  sequences  found  upstream  of  the  transcription  start 
sites  of  heat-inducible  genes  (reviewed  by  Pelham  (1985)).    Two  sets  of 
HSE  within  100  bases  of  the  TATA  box  are  necessary  for  heat 
inducibility. Pelham (1982) originally defined the HSE as the 14 base
palindrome CTNGAANNTTCNAG by deletion analysis of hybrid constructs in
monkey COS cells. Xiao and Lis (1988) have redefined the HSE as
overlapping 10 bp sequences of NTTCNNGAAN. This work was corroborated
by Amin et al. (1988) using Hsp70/LacZ fusions to transfect D.
melanogaster  cells.    These  definitions,  though  arrived  at  by  different 
means,  all  overlap,  i.e.  are  circular  permutations. 
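
That the two consensus definitions are circular permutations of one another can be verified mechanically: the central 10 bases of the Pelham palindrome are a rotation of the Xiao and Lis unit. The sketch below uses only the two consensus strings quoted above; the helper names and the 20-base test sequence are assumptions for illustration.

```python
import re

PELHAM = "CTNGAANNTTCNAG"  # 14-base palindrome (Pelham, 1982)
XIAO_LIS = "NTTCNNGAAN"    # 10-bp unit (Xiao and Lis, 1988)


def rotations(s: str) -> set[str]:
    """All circular permutations of s."""
    return {s[i:] + s[:i] for i in range(len(s))}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Turn a consensus with N wildcards into a compiled regular expression."""
    return re.compile(consensus.upper().replace("N", "[ACGT]"))


core = PELHAM[2:12]  # central 10 bases of the Pelham consensus
print(core)                         # prints NGAANNTTCN
print(core in rotations(XIAO_LIS))  # prints True

# Two tandem Xiao and Lis units (hypothetical sequence) contain the Pelham core
# across their junction.
two_units = "ATTCTAGAACGTTCAGGAAT"
match = consensus_to_regex(core).search(two_units)
print(match.start(), match.group())  # prints 5 AGAACGTTCA
```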

On  the  basis  of  these  definitions,  potential  HSEs  were  identified 
in  the  promoter  sequences  of  p70a  and  p70b  using  the  program  WEIGHTS. 
Figures  3-14  and  3-18  compare  the  mosquito  and  D.  melanogaster  promoters 
and indicate the high-scoring HSE-like regions. The locations of the
mosquito HSEs are similar to those observed in the D. melanogaster Hsp70
and other heat-shock genes (Pelham, 1985). However, the mosquito HSE
are  more  numerous  and  match  the  consensus  more  closely  than  those  of  D. 
melanogaster.    Scanning  sequences  with  the  Xiao  and  Lis  matrix 
contributed  little  additional  information. 
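
The window-scoring approach used by WEIGHTS can be sketched compactly in Python. This is an illustrative port, not the original program: the six-column layout (G, A, T, C, purine, pyrimidine) follows the Quick Basic listing in Figure 3-2, but the function names, the weight of 10 per position, and the test sequence are assumptions, and the matrix built here is not one of the matrices of Figure 3-1.

```python
def equal_weight_matrix(consensus: str, weight: int = 10) -> list[list[int]]:
    """Build a matrix with one row per consensus position; N positions score 0.

    Columns follow the WEIGHTS order: G, A, T, C, purine, pyrimidine.
    """
    column = {"G": 0, "A": 1, "T": 2, "C": 3}
    rows = []
    for base in consensus.upper():
        row = [0] * 6
        if base in column:
            row[column[base]] = weight
        rows.append(row)
    return rows


def base_score(base: str, row: list[int]) -> int:
    """Score one base against one matrix row."""
    b = base.upper()
    score = 0
    if b in ("G", "A", "T", "C"):
        score += row["GATC".index(b)]
    if b in ("A", "G"):
        score += row[4]  # purine column
    if b in ("C", "T"):
        score += row[5]  # pyrimidine column
    return score


def scan(seq: str, matrix: list[list[int]], min_score: int) -> list[tuple[int, int]]:
    """Return (1-based window start, score) for every window scoring >= min_score."""
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        total = sum(base_score(seq[start + k], matrix[k]) for k in range(width))
        if total >= min_score:
            hits.append((start + 1, total))
    return hits


matrix = equal_weight_matrix("CTNGAANNTTCNAG")       # Pelham (1982) consensus
test_seq = "GG" + "CTAGAATTTTCGAG" + "CC"            # hypothetical sequence
print(scan(test_seq, matrix, min_score=100))         # prints [(3, 100)]
```

Lowering `min_score` reports partial matches as well, which corresponds to choosing a score cutoff when saving results.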

Might mosquito Hsp70 promoters be superior to those of D.
melanogaster for the expression of hybrid genes in mosquitos?
Drosophila Hsp70 promoters are clearly inducible in whole mosquitos
(Miller et al., 1987) and mosquito cell cultures (Durbin and Fallon,
1985; Berger et al., 1985), but mosquito promoters are potentially
superior  in  several  ways.    First,  the  mosquito  Hsp70  promoters  might  be 
induced  to  a  higher  rate  of  transcription  than  the  commonly  used  D. 
melanogaster  Hsp70  promoter.    This  would  make  it  possible  to  obtain 
better  discrimination  with  genetic  markers,  and  achieve  higher  recovery 
of  transformed  individuals.    Additionally,  adequate  transcript  levels 
might  be  obtained  by  shocking  with  lower  temperatures  so  that  less 
stress  would  result  to  the  insect.    My  analysis  of  mosquito  heat  shock 
promoter  sequences  suggests  they  may  indeed  be  stronger  promoters  based 
on sequence composition (Xiao and Lis, 1988) and numbers of HSE (Kraus
et al., 1988).

A second set of possible improvements relates to the temporal and
tissue-specific  induction  of  mosquito  heat-shock  promoters.    Heat  shock 
promoters  that  contain  very  abundant  HSE  have  been  identified  whose 
expression  is  modulated  in  tissue-  and  developmentally-specific  ways  by 
other  promoter  sequences.    These  other  sequences  may  not  be  apparent, 
nor  conserved  in  other  genera,  e.g.  the  ecdysone  response  of  the  small 
heat-shock genes (Ireland et al., 1982; Simon and Lis, 1987) and tissue-
specificity of Hsp83 of D. melanogaster (Xiao and Lis, 1989). Though
ubiquitous expression is the rule using a D. melanogaster Hsp70
promoter, tissue-specific exceptions are observed (Bonner et al., 1984),
and mosquito Hsp70 promoters may differ advantageously by being either
more  or  less  specific. 

A  third  characteristic  of  the  mosquito  promoters  which  might  be 
exploited  is  the  divergent  orientation  of  the  genes.    The  presence  of 
this arrangement in both mosquitos and Drosophila spp., whether due to
convergence  or  conservation,  suggests  that  it  has  functional 
significance.    The  function  may  be  to  promote  conversion  within  the  gene 
pair  (Leigh-Brown  and  Ish-Horowicz,  1981)  (to  be  discussed  further  in 
Chapter  4),  or  the  divergent  arrangement  may  allow  HSE  to  act 
simultaneously  on  two  different  genes.    There  is  precedent  for 
bidirectional  regulation  of  genes  from  common  HSEs  in  Dictyostelium 
(Zuker et al., 1984) and for Caenorhabditis elegans heat shock promoters
in mouse cells (Kay et al., 1986). The extremely close proximity of the
HSE  arrays  of  the  two  divergent  gene  pairs,  particularly  of  p70a,  may 
promote  cooperativity  between  DNA-binding  proteins  that  affect  both 
transcription  units  simultaneously.    In  contrast,  the  D.  melanogaster 
Hsp70  promoters  in  use  for  hybrid  gene  expression  are  single  upstream 
arrays,  derived  in  fact  from  loci  at  which  the  genes  are  in  tandem 
repeats,  unlike  Hsp70  genes  in  most  Drosophila  species. 

Another  possible  advantage  of  maintaining  the  divergent 
orientation  might  be  to  exclude  regulatory  proteins  from  interfering 
with  proper  expression.    For  example,  the  D.  melanogaster  Hsp83  promoter 
contains  regions  upstream  of  the  TATA  box-proximal  HSE  that  are 
responsible for tissue- and temporal-specific expression. If these are
deleted,  regulation  is  similar  to  Hsp70  (Xiao  and  Lis,  1989). 

One  might  object  that  no  improvement  would  be  made  in  gene 
expression  by  using  promoters  with  HSE  at  a  distance  of  greater  than  100 
bp  since  HSE  beyond  that  distance  are  not  considered  to  be  necessary  to 
induce the heat-shock response in transgenic animals (Pelham,
1982; Corces and Pellicer, 1984). However, HSE do occur beyond that
distance both in Drosophila and A. albimanus (Figures 3-14 and 3-18) and
have been shown to be important in regulation of the small heat shock
protein  genes  (Simon  and  Lis,  1987). 

It  is  possible  to  address  the  functional  significance  of  the 
divergent orientation using either D. melanogaster or A. albimanus
symmetrical promoter regions to control the expression of two different
divergently transcribed reporter genes. Constructs such as this could
be altered by varying the distance separating the genes, mutating one
half,  or  creating  absolute  symmetry  without  an  intervening  diverged 
region.    These  hybrid  genes  could  then  be  transfected  into  cultured 
cells,  assayed  by  transient  expression,  or  introduced  into  Drosophila  by 
P-element  transformation  (Rubin  and  Spradling,  1982).    Variable  results 
due  to  chromosomal  location  would  tend  to  affect  each  reporter 
similarly,  but  could  be  controlled  further  by  comparing  different 
transformed  lines  carrying  the  same  construct.    Reporter  differences 
could  be  controlled  by  placing  either  gene  in  both  positions  relative  to 
the  asymmetries  of  the  promoter.    These  experiments  have  the  potential 
to  reveal  subtle  effects  on  Hsp70  promoter  activity  that  have  been 
overshadowed  by  the  large  effects  due  to  the  HSE. 

The mosquito heat shock promoters that I have isolated have
potential for improving hybrid gene expression. The considerations
above make clear, however, that whether they do so can be resolved only
by experimentation.

Figure 3-1. HSE Scoring Matrices For Use in the Program WEIGHTS. The
upper matrix is from Xiao and Lis (1988). Only scores above 200 were
saved. The lower matrix is based on the general consensus of Pelham
(1982) using a score cutoff of 70. In the latter matrix, each position
is weighted equally. The third matrix of Amin et al. (1988) is based on
nucleotide frequencies in a 38 bp region of recognized heat shock
promoters from several species. The minimum score saved in that
analysis was 1100. Weighting tables for sequence analysis by WEIGHTS
should be created as text files in the form shown above. Fractional
values may be used as well.


10  KEY  OFF:  CLS 
20  DS  =  5020 


'Dimensions  length  of  sequence  array 
'Weight  array  length 
'Weight  array  width 
'Dimensions  the  score  output  array 
'Dimensions  the  sequence  array 
'Dimensions  the  weight  array 


30  DWR  =  50 
40  DWC  =  6 


50  DIM  SCORE(DS) 
60  DIM  SEQ$(DS) 
70  DIM  WGT(50,  6) 


80  Z  =  0 

100  INPUT  "ENTER  WEIGHT  TABLE  FILE  NAME:  ",  QFM$ 

110  OPEN  QFM$  FOR  INPUT  AS  #2 

120  IF  E0F(2)  THEN  GOTO  300 

130  LINE  INPUT  #2,  TITLES 

140  IF  E0F(2)  GOTO  300 

150  LINE  INPUT  #2,  LETRS$ 

160  IF  E0F(2)  GOTO  300 

170  LINE  INPUT  #2,  DASH$ 

180  IF  E0F(2)  GOTO  300 

190  PRINT  TITLES 

200  PRINT  LETRSS 

210  PRINT  DASH$ 

220  IF  E0F(2)  GOTO  300 

230  Z  =  Z  +  1 

240  LINE  INPUT  #2,  REC$ 

250  PRINT  REC$ 

260  FOR  X  =  1  TO  DWC 

270  WGT(Z,  X)  =  VAL(MID$(REC$,  (X  *  6)  +  1,  4)) 
280  NEXT  X 
290  GOTO  220 
300  Y  =  0 
310  PRINT  "" 

320  INPUT  "ENTER  SEQUENCE  FILE  NAME  :  ",  FLN$ 

330  INPUT  "ENTER  SCORE  OUTPUT  FILE  NAME  :  ",  OFN$ 

340  INPUT  "WHAT  IS  THE  MINIMUM  SCORE  YOU  WANT  SAVED?  :  ",  MN 

350  PRINT 

360  OPEN  FLN$  FOR  INPUT  AS  #1 
370  IF  EOF(l)  GOTO  450 
380  LINE  INPUT  #1,  REC$ 
390  E  =  LEN(REC$) 
400  FOR  X  =  1  TO  E 
410  Y  =  Y  +  1 

420  SEQ$(Y)  =  MID$(REC$,  X,  1) 

430  NEXT  X 

440  GOTO  370 

450  A  =  Y  +  1  -  Z 


Figure  3-2.    WEIGHTS.    The  program  WEIGHTS  was  written  in  Quick  Basic 
(Tm  Microsoft)  and  was  designed  to  score  windows  of  DNA  sequence 
relative  to  a  user-defined  weight  matrix.    Input  files  should  consist 
only  of  the  DNA  sequence  in  a  text  file  and  the  weight  matrix  file 
should  be  formatted  exactly  as  the  examples  in  Figure  3-1 
460 B = Z - 1
470 FOR NUC = 1 TO A
480 FOR ROW = 0 TO B
490 C = NUC + ROW
500 D = ROW + 1
510 IF INSTR("Gg", SEQ$(C)) > 0 THEN SCORE(NUC) = SCORE(NUC) + WGT(D, 1)
520 IF INSTR("Aa", SEQ$(C)) > 0 THEN SCORE(NUC) = SCORE(NUC) + WGT(D, 2)
530 IF INSTR("Tt", SEQ$(C)) > 0 THEN SCORE(NUC) = SCORE(NUC) + WGT(D, 3)
540 IF INSTR("Cc", SEQ$(C)) > 0 THEN SCORE(NUC) = SCORE(NUC) + WGT(D, 4)
550 IF INSTR("AaGg", SEQ$(C)) > 0 THEN SCORE(NUC) = SCORE(NUC) + WGT(D, 5)
560 IF INSTR("CcTt", SEQ$(C)) > 0 THEN SCORE(NUC) = SCORE(NUC) + WGT(D, 6)
570 NEXT ROW
580 NEXT NUC
590 OPEN OFN$ FOR APPEND AS #3
600 PRINT "Sequence output file: "; OFN$
610 PRINT "Sequence filename: "; FLN$
620 PRINT ""
630 PRINT "Weighting table used: "; QFM$
640 PRINT "Minimum score saved "; MN
650 PRINT "Number of nucleotides examined: "; Y   ' operand lost in the scanned listing; Y (bases read) is assumed
660 PRINT ""
670 PRINT "POSITION SCORE"
680 PRINT " "
690 PRINT #3, "Sequence output file: "; OFN$
700 PRINT #3, "Sequence filename: "; FLN$
710 PRINT #3, ""
720 PRINT #3, "Weighting table used: "; QFM$
730 PRINT #3, "Minimum score saved "; MN
740 PRINT #3, "Number of nucleotides examined: "; Y   ' operand assumed, as in line 650
750 PRINT #3, ""
760 PRINT #3, "POSITION SCORE"
770 PRINT #3, " "
780 FOR Q = 1 TO A
790 TOT = TOT + SCORE(Q)
800 IF SCORE(Q) >= MN THEN PRINT Q, SCORE(Q): PRINT #3, Q, SCORE(Q)
810 NEXT Q
820 AVG = TOT / A
830 FOR Q = 1 TO A
840 SUM = SUM + ((SCORE(Q) - AVG) ^ 2)
850 NEXT Q
860 STDEV = SQR(SUM / (A - 1))
870 PRINT #3, " "
880 PRINT #3, "Average score value "; AVG
890 PRINT "Average score value "; AVG
900 PRINT #3, "Score standard deviation "; STDEV
910 PRINT "Score standard deviation "; STDEV
920 PRINT #3, "End of scan."
930 PRINT "End of scan."
940 PRINT #3, ""
950 PRINT #3,
960 CLOSE
970 SYSTEM


Figure  3-2--continued. 
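
The scoring loop in lines 460-580 and the summary statistics in lines 780-860 can be restated compactly. The Python sketch below is not part of the original program; it mirrors the logic of the listing under two assumptions read from the listing itself: the weight columns are ordered G, A, T, C, purine, pyrimidine (lines 510-560), and each weight occupies a 4-character field starting at character X*6+1 of a table row (line 270). All function names are illustrative.

```python
from statistics import mean, stdev

BASE_COLUMN = {"G": 0, "A": 1, "T": 2, "C": 3}   # WGT columns 1-4
PURINE_COLUMN, PYRIMIDINE_COLUMN = 4, 5           # WGT columns 5-6


def val(field):
    """Rough stand-in for BASIC VAL(): non-numeric text counts as 0."""
    try:
        return float(field)
    except ValueError:
        return 0.0


def parse_weight_row(rec, n_cols=6):
    """Read field X (1-based) from characters X*6+1 .. X*6+4 of a table row."""
    return [val(rec[x * 6 : x * 6 + 4]) for x in range(1, n_cols + 1)]


def score_windows(seq, weights):
    """Score every window of len(weights) bases, one weight row per position.

    As in the listing, a base adds its own column weight and, in addition,
    the weight of its purine or pyrimidine class column.
    """
    width = len(weights)
    scores = []
    for start in range(len(seq) - width + 1):
        total = 0.0
        for row, base in enumerate(seq[start : start + width].upper()):
            if base in BASE_COLUMN:
                total += weights[row][BASE_COLUMN[base]]
            if base in "AG":
                total += weights[row][PURINE_COLUMN]
            elif base in "CT":
                total += weights[row][PYRIMIDINE_COLUMN]
        scores.append(total)
    return scores


def summarize(scores):
    """Mean and sample standard deviation (divisor A - 1, as in line 860)."""
    return mean(scores), stdev(scores)
```

Window positions are 1-based in the BASIC output, so index 0 of the returned list corresponds to POSITION 1.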
Figure 3-7. In Situ Hybridizations. Biotinylated probes were hybridized to the salivary-gland polytene chromosomes of A. albimanus. Panels and probes used: A and B, p70a.16, which cross-hybridizes to p70a and p70b and, in the chromosomes, to both 11C and 13C; C and D, p70a.dl, which is p70a-specific and hybridizes only to one locus, 13C; E, p70b.5, which is p70b-specific and hybridizes only to 11C; F and G, p83a, which hybridizes only to 40A.
Figure 3-10. RNA Profiles. Relative induction of p70a.16-hybridizing transcripts upon heat shock for 30 minutes at various temperatures (upper graph) or over time (lower graph). Methods for obtaining the values are discussed in Materials and Methods.
Table 3-1. Features of p70a and p70b Clones

                            p70a                        p70b
Feature            Left Gene    Right Gene    Left Gene    Right Gene

ORF(a)             2068-146     3037-4959     1506-1       2706-4628
TATA Box           2321-2315    2773-2779     1719-1712    2493-2500
Transc. Start      2290         2805          1687,1690    2522,2525
Transl. Start      2068         3037          1506         2706
PolyA Signal(b)    19-14        ?(c)          ?            4636

(a) Distances listed indicate the position on the sequence in Figures 3-12 and 3-13 and are listed numerically according to the putative direction of transcription.

(b) Polyadenylation sites should be considered tentative.

(c) Question marks indicate no clear polyadenylation sites were observed.
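
The coordinates in Table 3-1 can be checked with simple arithmetic. The Python sketch below is an illustration added here, not part of the original analysis: it computes the length of each reading frame and the distance from the transcription start to the translation start. The dictionary layout and function names are assumptions; for p70b, which has two mapped transcription starts, only the first listed value is used.

```python
# Coordinates copied from Table 3-1; (start, end) follows the direction of transcription.
FEATURES = {
    "p70a left":  {"orf": (2068, 146),  "tx_start": 2290},
    "p70a right": {"orf": (3037, 4959), "tx_start": 2805},
    "p70b left":  {"orf": (1506, 1),    "tx_start": 1687},  # first of two mapped starts
    "p70b right": {"orf": (2706, 4628), "tx_start": 2522},  # first of two mapped starts
}


def orf_summary(orf):
    """Return (length in nt, whole codons, leftover nt) for an inclusive range."""
    start, end = orf
    length = abs(end - start) + 1
    return length, length // 3, length % 3


def leader_length(tx_start, orf_start):
    """Nucleotides transcribed before the first base of the start codon."""
    return abs(orf_start - tx_start)


for name, feat in FEATURES.items():
    length, codons, extra = orf_summary(feat["orf"])
    leader = leader_length(feat["tx_start"], feat["orf"][0])
    print(f"{name}: ORF {length} nt, {codons} codons (+{extra} nt), leader {leader} nt")
```

The three complete reading frames each span 1,923 nt, which is 641 codons counting the stop codon, or 640 residues. At a rule-of-thumb average of roughly 110 Da per residue this is close to 70 kDa. The p70b left gene is shorter (1,506 nt) because its sequence begins at the cloning site.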
TATTTGGTTAACATTTATTAACATTCGCTGATAATATATCATACTGATGGCAACATTATG

1   +  +  +  ^  ^  ^  gQ 

Eco  RI 

AGCCCATAATTTATCATGAATTCATCTAGTCTTAGAGTCTTTGTTAATCTTCTCTCCTAG
61   +  +  +  +  ^  ^  J20 

TTTAAAGTTGGCTGAACATTGATATTTAGTCCACCTCTTCCACCGTCGGTCCCGTCCTTC 
121   +  +  +  ^  ^  ^  jQO 

EndAspVal  Gl  uGl  uVal  ThrProGlyThrArgGly 

CGCCGAATCCTCCAGCTTGCTGTCCACAGCTGGTTGGTTGCGGACCACCAGCCGCTTGCT 
^  .       181   +  +  +  +  +  ^  240 

GlyPheGlyGlyAlaGlnGlnGlyCysSerThrProGlnProGlyGlyAlaAlaGlnGln 

Pst  I 

GATGCAGTTTGGTCATGATGGGACTGCAGACCCGCGACAACTCTTGCATTTGGTGCTCGT

241   +  +  +  +  ^  300 

Hi sLeuLysThrMet II eProSerCysVal  ArgSerLeuGl  uGl  nMetGl  nHi sGl  uTyr 

ACTCTTCCTTTTCCGCCATTGTGTTGCCATCGATCCATCGCAGAGTCTCGTCGCATCGAT 
301   +  +  ^  ^  _^_  ^ 

Gl uGl uLysGl uAl aMetThrAsnGlyAspIl eTrpArgLeuThrGl uAspCysArgAsp 

CCTGCACCGTTCTGCGATCGGCTTCGCTGAGTTTGCTCGATCCTTCTCCGTCCAGGGATT 
361  +  +  ^  ^  _^  ^ 

GlnValThrArgArgAspAlaGluSerLeuLysSerSerGlyGluGlyAspLeuSerGln 

Xho  I 

GTTTCAGGTTGAAGCAGTATGCCTCGAGCTGATTGCGTGCGGCAATGGCCTCTCGCTGCT
421  +  +  ^  ^  ^  ^ 

LysLeuAsnPheCysTyrAl aGl uLeuGl nAsnArgAl aAl all eAl aGl uArgGl nLys 

TCTCATCCTCCTCGCGGTACTTTTCGGCCTCCGATACCATTCGATCGATGTCGGCCTGCG 
481   +  +  +  ^  ^  ^ 

Gl uAspGl uGl uArgTyrLysGl uAl aGl uSerValMetArgAspIl eAspAl aGl nSer 

ATAGGCGACCTTTATCGTTCTTGATCGTGATATTCTTCTCTTTTCCGCTGCTCTTATCCT 
541    +  +  +  ^  ^  ^ 

LeuArgGlyLysAspAsnLysIleThrlleAsnLysGluLysGlySerSerLysAspLys 


Figure 3-12. DNA Sequence of p70a Right and Left Genes and Spacer. Restriction sites mapped and shown in Figure 3-3 are underlined, as are the TATA boxes and transcription start sites. The sequence begins with the transcribed strand of the left gene. Heat-shock element-like sequences are shaded and are underlined where they overlap.
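
Because the displayed strand is the transcribed strand of the left gene, the left gene's coding sequence is the reverse complement of what is printed, and its protein is shown right to left beneath the sequence. The short Python sketch below is an added illustration, not the author's code: it takes bases 146-178 of the displayed strand and recovers the C-terminal residues and stop codon of the left gene. The helper names are illustrative, and the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate whole codons from the first base; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i : i + 3]] for i in range(0, len(seq) - 2, 3))


# Bases 146-178 of the displayed strand in Figure 3-12.
displayed = "TTAGTCCACCTCTTCCACCGTCGGTCCCGTCCT"
coding = reverse_complement(displayed)
print(coding)
print(translate(coding))
```

The translated residues, read in one-letter code, correspond to the Arg-Thr-Gly-Pro-Thr-Val-Glu-Glu-Val-Asp-End annotation printed in reverse under this stretch of the figure.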
TGGCTGCGACGTTCAGGATTCCGTTTGCGTCCAGATCGAAAGTCACCTCGATCTGCGGTA
601   +  +  +   +  +   +  660 

Al aAl aVal AsnLeull eGlyAsnAl aAspLeuAspPheThrVal Gl ull eGl nProVal 


CACCACGTGGGGCCGGCGGGATGCCCGAGAGGTCGAACTGTCCCAAAAGATTGTTGTCCT 
661      -  -+  +  +  +  +---  .+  720 

GlyArgProAl aProProIl  eGlySerLeuAspPheGl nGlyLeuLeuAsnAsnAspLys 

TGGTCATGGCTCGCTCTCCTTCGAATACCTGGATCGAGACTCCGGGCTGGTTGTCGGCGT 
721   +  +  +  +  ^  ^  780 

ThrMetAl  aArgGl uGlyGl uPheVal 61  nil eSerVal GlyProGl nAsnAspAl  aTyr 
Bgl  II 

ACGTCGAGAAGATCTTCGTCTGTTTGCAAGGAATGCGCGAGTTGCGTTCAATCAGCTTCG
781   +  +  +  ^  ^  ^  g^Q 

ThrSerPhelleLysThrGlnLysCysProIleArgSerAsnArgGluIleLeuLysThr 

TCATCACACCTCCGGCCGTCTCGATGCCAAGCGACAATGGAGCGACATCCACCAGCAGCA 
841   +  +  +  +  ^  ^  goo 

MetVal GlyGlyAl aThrGl ull eGlyLeuSerLeuProAl  aVal  AspVal  LeuLeuVal 

CGTCCTGAATCTTGTCATCCTTGTCGCCGCTAAGGATGGCCGCTTGCACCGCAGCACCGT 
901   +  +  +  ^  ^  ^  ggO 

AspGl nil eLysAspAspLysAspGlySerLeuIl eAl aAl aGl nVal  Al aAl aGlyTyr 

ATGCTACCGCTTCGTCCGGGTTGATCGAAAGGTTCAACGACTTTCCAGCGAAGAAGTTCT 
961   +  +  +  ^  ^  ^  J020 

Al aVal Al aGl uAspProAsnll eSerLeuAsnLeuSerLysGlyAl aPhePheAsnGl n 

GCAACAGGGACTGCACCTTCGGTATGCGAGTTGAGCCTCCTACCAGGACGATATCGTGAA 
1021     ---------+  +  +  ^  ^  ^  JQgQ 

LeuLeuSerGl nVal LysProIl eArgThrSerGlyGly Val LeuValll eAspHi sll e 

TGGAGCTCTTATCCATCTTCGCATCGGACAGAGCCTTCTCCACCGGCTGCAACGTCGAAC 
1081  +  +  +  ^  ^  ^  jj^Q 

SerSerLysAspMetLysAl aAspSerLeuAl aLysGl uVal ProGl nLeuThrSerArg 

GGAACAGGTCCGAGCATAGCTCCTCGAATCGTGCCCGGCTGATCTTCGTGTAATAATCGA 
1141   +  +  +  ^  ^  3— -j-  J200 

PheLeuAspSerCysLeuGl uGl uPheArgAl aArgSerll eLysThrTyrTyrAspIl e 
Xho  I

TGCCATCCATCAGGGCGTCAATCTCGATCGTTGCTTCCGTGCTCGAGGACAACGTGCGCT 

1201   +   +  +  +  +  +  1260 

GlyAspMetLeuAl aAspI 1 eGl ull eThrAl aGl uThrSerSerSerLeuThrArgLys 


TTGCCCGTTCACATGCCGTTCTCAAACGCCGCAGGGCTCGGGCATTCTTCGACAGATCCT 

1261   +--  --+--  +   +  +  +  1320 

Al aArgGl uCysAl aThrArgLeuArgArgLeuAl aArgAl aAsnLysSerLeuAspLys 

Eco  RI 

TCTTGAATTTCCGTTTGAATTCCTCCACGAAGTGAGCCACCATCCGGTTGTCAAAGTCTT 

1321   +  +  +  +  +  +  1380 

LysPheLysArgLysPheGl uGl uVal PheHi  sAl aVal MetArgAsnAspPheAspGl u 

CGCCTCCTAGATGAGTGTCTCCAGCAGTAGCACGCACTTCGAACAGCGATCCCTCGTCGA 
1381   +  +  +.  ...+  +  +  1440 

GlyGlyLeuHi  sThrAspGlyAl aThrAl aArgVal Gl uPheLeuSerGlyGl uAspIl e 

TCGTCAGGATGGAAACGTCGAAGGTTCCGCCACCCAGATCGAAGATCAGCACGTTCCGTT 
1441   --+  +  --+  +  +  +  1500 

ThrLeuIleSerValAspPheThrGlyGlyGlyLeuAspPhelleLeuValAsnArgGlu 

CTCCCTTCAGGTTCTTATCCAAGCCGTACGCCAGAGCTGCCGCCGTCGGTTCGTTGATGA 
1501   +  +  +  +  +  +  1550 

GlyLysLeuAsnLysAspLeuGlyTyrAlaLeuAlaAlaAlaThrProGluAsnllelle 

TGCGCATCACATTCAAGCCAGCGATGGCTCCAGCATCCTTTGTGGCCTGTCGCTGACTGT 
1561   +  +  +  +.    ^  1520 

ArgMetVal AsnLeuGlyAl all eAl aGlyAl aAspLysThrAl aGl nArgGl nSerAsp 

CGTTGAAGTAGGCTGGTACTGTGATGACTGCATTTTTCACTGACTGTCCCAAGTAGGCTT 
1621   +  +   +  +  +  +  1680 

AsnPheTyrAl aProVal Thrll eVal Al aAsnLysVal SerGl nGlyLeuTyrAl aGl u 

CGGCGGTTTCCTTCATCTTCGTCAGGACCATTGAACTGATTTCCTCGGGGGCAAAGGTTT 
1681   +  +  +  +  +  ^  1740 

Al aThrGl uLysMetLysThrLeuVal MetSerSerll eGl uGl uProAl aPheThrLys 

TGCGCTCGCCCTTGAACTCGACACGGATCTTGGGTTTGCCGCAATCGTTCACCACCGTGA 

17^1    --- + +--  -+  +  +--  -+  1800 

ArgGluGlyLysPheGluValArglleLysProLysGlyCysAspAsnValValThrPhe 
ATGGCCAGTGCTTCATATCGGCCTGGATCTTCGGATCATCGAATTTGCGTCCAATCAACC

1801   +  +  +  +  +  +  1860 

ProTrpHisLysMetAspAl aGlnlleLysProAspAspPheLysArgGlylleLeuArg 


GCTTGGCATCGAATACCGTGTTGGTCGGATTCATGGCCACCTGGTTCTTGGCTGCATCTC 

1861   +  +  --+  +  -+  +  1920 

LysAl aAspPheVal ThrAsnThrProAsnMetAl aVal G1 nAsnLysAl aAl aAspGly 

CGATGAGCCGCTCCGTGTCCGAAAAGGCAACGTAGCTCGGTGTTGTTCGGTTGCCCTGGT 
1921   +  +  +  +   +...  +  1980 

I 1 eLeuArgGl uThrAspSerPheAl aVal TyrSerProThrThrArgAsnGl yGl nAsp 

CGTTTGCGATGATCTCCACCTTTCCATGCTGGAACACACCCACGCACGAGTACGTGGTGC 

1981    --  +  +  +   +  +---  +  2040 

AsnAl all  ell eGl uVal LysGlyHi  sGl nPheVal GlyVal CysSerTyrThrThrGly 

Start  Codon 

CCAGGTCAATTCCAATTGCAGACGGCAITCTGTGTTTGTTGCTCTCGATGTTTTCTCTCA 

2041   +  +  +  +   +--  +  2100 

LeuAspIl eGlyll eAl aSerProMet 


GAAATCTCGATAATACTTCACTTGTTGCACTTGAAACTGTGTGTTGTAACTGATTCACTT 
2101   +  +  +  +.  +   +  2160 


TCTAATGCACTGATTACTTGACTTTTATCTCTCTTGGTGATAAG6GATTCTATCTTTCGT 
2161   +  +  +  +  -+  +  2220 

ATCTTCACGTGTTAGCTTCGCGCCGTTCTTGGCTCTCTTGCTTGCTTGTTCGCTTGTTTG 
2221    --  -+  +  +  +  +  +  2280 

Transcription  Start  TATA  Box 

TGTTCAACTGACAGTGGCTGCTCGAACTGCTCGGTIIIAWGAAACCACTTGCATTlii 
2281   +  +  --+  +  +  ..r:+  2340 

Palindrome  <

GAAAGTACGMACAGTTMmWGCIlGAMTGATCGAGATTCTGCTCGA 
2341   +  +  +  +  ^  2400 

 > 

AGAATGTCCCTAGCAGCTGCGCCTTTGCTGTCTTGCGTGCGGT86AAATTTCTGSTTTCA 
2401  +  +---  -+  +-  —  —  -+.--......+  2460 
Palindrome<  Bgl  II--

CAAGAAAGTTTCGTAGAGATGAMGACCACTGGAATCATGTGSGATTCTTGTA6ATCTAG 
2461   +  +  --+  +  +  --+  2520 

 > 

ACAATCTGTCATCATAAATATGGTTGGCCATACGTTGTTAATGTAACGCTCTCTGGAAAC 
2521   +  --+  +--  +  +  +  2580 


TAACTSCCTTGCAACAGCCGTTCGCATCACCACAGAACTTTTCCCGAAACCAATCATCAC 
2581    -  +--  --+  -+  +  +  +  2640 


X  ACGCAAGACAGTTGGGCCGCllCGGACGTtCTACGGGTATCGAGCAGAATTTAGAGCTCT 

2641   +  +-  +  +  -+  +  2700 

J   <     ^  <  --  -  >  Palindrome 


2701    —  +  +.-.:.....+...  .  +   .+  2760 

.    /  TATA  Box  Transcription  Start 

GCAAGTGGTTTCAIAIMMGCGAGCAGTTCGAGCAGCCACCGTCAGTTGAACACAAACA 
2761   +-  +  +  +  +  +  2820 

AGCGAACAAGCAAGCAAGAGAGCCAAGAACGGCGCGAAACTAACACGTGAAGATACGAAA 
2821   +  +  +  +  +  +  2880 

GATAGAATCCCTTATCACCAAGAGAGATAAAAGTTAAGTAATCAGTGTATCAGAAAGTGA 
2881   +  +  +  +  ...+...  .+  2940 

ATCAGTTACAACACACAGTTTCAAGTGCGACAAGTGAAGTATTATCGAGATTTCTGAGAG 
2941   +  +  +-  +  +  +  3000 

Start  Codon 

AAAATATCGAGACCAAGTTAGAGCAACAAACACAGAAIGCCGTCTGCAATCGGAATTGAC 
3001   +  +-  -+  +  +  +  3060 

MetProSerAlalleGlylleAsp 

CTGGGAACCACGTACTCGTGCGTGGGTGTGTTCCAGCATGGAAAGGTGGAGATCATCGCA 
3061   +--  -+  +  +  ^  ^  3J20 

LeuGlyThrThrTyrSerCysVal GlyVal PheGl nHi  sGlyLysVal Gl ull ell eAl a 


AACGACCAGGGCAACCGAACAACGCCGAGCTACGTTGCCTTTTCGGACACGGAGCGCCTC 
3121  +  +  +  ^  ^  ^  3jgQ 

AsnAspGl pGlyAsnArgThrThrProSerTyrVal Al aPheSerAspThrGl uArgLeu 
Figure  3-12--continued. 
ATCGGAGATGCAGCCAAGAACCAGGTGGCCATGAATCCGACCAACACGGTGTTTGATGCC

3181   +  +  +-  +--  +  +  3240 

II eGlyAspAl aAl aLysAsnGl nVal Al aMetAsnProThrAsnThrVal PheAspAl a 


AAGCGGCTGATTGGACGAAAATTCGATGATCCGAAGATCCAGGCCGATATGAAGCACTGG 

3241    +  +  +   +  +--  +  3300 

LysArgLeuIl eGlyArgLysPheAspAspProLys II eGl nAl aAspMetLysHi  sTrp 

CCATTCACGGTGGTGAACGATTGCGGCAAACCCAAGATCCGCGTCGAGTTCAAGGGCGAG 
3301   +  +  +  +  +  +  3360 

ProPheThrValValAsnAspCysGlyLysProLysIleArgValGluPheLysGlyGlu 

CGCAAAACCTTTGCCCCCGAGGAAATCAGTTCAATGGTCCTGACGAAGATGAAGGAAACC 

3361   +--  --+  +   +   +  +  3420 

ArgLysThrPheAl aProGl uGl u II eSerSerMetVal LeuThrLysMetLysGl uThr 

GCCGAAGCCTACTTGGGACAGTCAGTGAAAAATGCAGTCATCACAGTACCAGCCTACTTC 

3421   +  +  +  +  +--  +  3480 

Al aGl uAl aTyrLeuGlyGl nSerVal LysAsnAl aVal II eThrVal ProAl aTyrPhe 

AACGACAGTCAGCGACAGGCCACAAAGGATGCTGGAGCCATCGCTGGCTTGAATGTGATG 
3481   +  +..  ..+  +  +  +  3540 

AsnAspSerGl nArgGl nAl aThrLysAspAl aGlyAl all eAl aGlyLeuAsnValMet 

CGCATCATCAACGAACCGACGGCGGCAGCTCTGGCGTACGGCTTGGATAAGAACCTGAAG 
3541   +  +  +--  +  +  +  3600 

Argil  ell eAsnGl uProThrAl aAl aAl aLeuAl aTyrGlyLeuAspLysAsnLeuLys 

GGAGAACGGAACGTGCTGATCTTCGATCTGGGTGGCGGAACCTTCGACGTTTCCATCCTG 
3601   +  +  +  +  +  +  3660 

GlyGluArgAsnValLeuIlePheAspLeuGlyGlyGlyThrPheAspValSerlleLeu 

ACGATCGACGAGGGATCGCTGTTCGAAGTGCGTGCTACTGCTGGAGACACTCATCTAGGA 
3661   +  +  +  +  +  +  3720 

Thrll eAspGl uGlySerLeuPheGl uVal ArgAl aThrAl aGlyAspThrHi  sLeuGly 

Eco  RI

GGCGAAGACTTTGACAACCGGATGGTGGCTCACTTCGTGGAGGAATTCAAACGGAAATTC 
3721   +  +  +  +  ^.  ^  37QQ 

GlyGl uAspPheAspAsnArgMetVal Al aHi  sPheVal Gl uGl uPheLysArgLysPhe 
AAGAAGGATCTGTCGAAGAATGCCCGAGCCCTGCGGCGTTTGAGAACGGCATGTGAACGG

3781   +  ---+  --+  --+  +  +  3840 

LysLysAspLeuSerLysAsnAl aArgAl aLeuArgArgLeuArgThrAl aCysGl uArg 

Xho  I 

GCAAAGCGCACGTTGTCCTCGAGCACGGAAGCAACGATCGAGATTGACGCCCTGATGGAT 

3841   -+  +  +  +  +  +  3900 

Al aLysArgThrLeuSerSerSerThrGl uAl aThrll eGl ull eAspAl aLeuMetAsp 

Cla  I

GGCATCGATTATTACACGAAGATCAGCCGGGCACGATTCGAGGAGCTATGCTCGGACCTG 

3901    +   +---  +   +  +  +  3960 

GlylleAspTyrTyrThrLysIleSerArgAl aArgPheGluGluLeuCysSerAspLeu 


TTCCGTTCGACGTTGCAGCCGGTGGAGAAGGCTCTGTCCGATGCGAAGATGGATAAGAGC 

3961    +  +   +  +---  +  +  4020 

PheArgSerThrLeuGl nProVal Gl uLysAl aLeuSerAspAl  aLysMetAspLysSer 


TCCATTCAC6ATATCGTCCTGGTAGGAGGCTCAACTCGCATACCGAAGGTGCAGTCCCTG 

4021   +  +  +  +  +-  +  4080 

Serll eHi  sAspIl eVal LeuVal GlyGlySerThrArgll eProLysVal Gl nSerLeu 


TTGCAGAACTTCTTCGCTGGAAAGTCGTTGAACCTTTCGATCAACCCGGACGAAGCGGTA 
4081   +  +  +---  -+--  --+-  .+  4140 

LeuGl nAsnPhePheAl aGlyLysSerLeuAsnLeuSerll eAsnProAspGl uAl aVal 


GCATACGGTGCTGCGGTGCAAGCGGCCATCCTTAGCGGCGACAAGGATGACAAGATTCAG 

4141   +  +  +  +  ---+  +  4200 

Al aTyrGlyAl aAl aVal Gl nAl aAl all eLeuSerGlyAspLysAspAspLys II eGl n 


GACGTGCTGCTGGTGGATGTCGCTCCATTGTCGCTTGGAATCGAGACGGCCGGAGGTGTG 
4201   +  +  +  +  +  +  4260 

AspVal LeuLeuVal AspVal Al aProLeuSerLeuGlyll eGl uThrAl aGlyGlyVal 

Bgl  II 

ATGACAAAGCTGATTGAACGCAACTCGCGCATTCCTTGCAAACAGACGAAGATCTTCTC6 
4261   +  +--  +  +   +.  +  4320 

MetThrLysLeulleGluArgAsnSerArglleProCysLysGlnThrLysIlePheSer 


ACATACGCCGACAACCAGCCCGGAGTCTCGATCCAGGTGTTCGAAGGAGAGCGAGCCATG 

 +---  +  +  +  +  +  4380 

ThrTyrAl aAspAsnGl nProGlyVal Serll eGl nVal PheGl uGlyGl uArgAl aMet 
ACCAAGGACAACAATCTTTTGGGACAGTTCGACCTCTCGGGCATTCCGCCGGCCCCACGT

4381   +  +  +  +  +-  +  4440 

ThrLysAspAsnAsnLeuLeuGlyGlnPheAspLeuSerGlylleProProAlaProArg 

GGTGTACCGCAGATCGAGGTAACTTTCGATCTGGACGCAAACGGAATCCTGAACGTGGCA 

4441    +-  ---+  +  --+  +  +  4500 

GlyVal  ProGl  nil  eGl  uVal  ThrPheAspLeuAspAl aAsnGlyll eLeuAsnVal Al a 

GCCAAGGATAAGAGCAGCGGAAAGGAGAAGAACATCACGATCAAAAACGATAAAGGTCGC 
4501   +  +  +  +.  +  4550 

AlaLysAspLysSerSerGlyLysGluLysAsnlleThrlleLysAsnAspLysGlyArg 

CTATCGCA6GCCGACATCGATCGAATGGTATCG6AGGCCGAAAAGTACCGCGAGGAGGAT 
4561   +  +  ...+   +   +  4520 

LeuSerGl nAl aAspIl eAspArgMetVal SerGl uAl aGl uLysTyrArgGl uGl uAsp 

Xho  I 

GAGAAGCAGCGAGAGGCCATTGCCGCACGCAATCAGCICGAGGCATACTGCTTCAACCTG 
4621   +  +  +  +  +  ^  4580 

Gl  uLysGl  nArgGl  uAl  all  eAl  aAl  aArgAsnGl nLeuGl uAl aTyrCysPheAsnLeu 

AAACAATCCCTGGACGGAGAAGGATCGAGCAAACTCAGCGATGCCGATCGCAGAACGGTT 
4681   +  +  +  +  ^  ^  4740 

LysGlnSerLeuAspGlyGluGlySerSerLysLeuSerAspAl aAspArgArgThrVal 

CAAGATCGATGCGACGAGACTCTGCGGTGGATCGATGGCAACACTATGGCGGAGAAGGAA 
4741   ---+  +  +-  +   +  +  4800 

GlnAspArgCysAspGluThrLeuArgTrpIleAspGlyAsnThrMetAlaGluLysGlu 

Pst  I 

GAGTACGAGCACCAAATGCAAGAGTTGTCNCGGGTCI6CAGTCCCATCATGACCAAACTG 
4801   +  +  +  +  +  ^  4850 

Gl uTyrGl uHi  sGl nMetGl nGl uLeuSerArgVal CysSerProIl eMetThrLysLeu 

CATCAGCAAGCGGCTGGTGGTCCGCAACCAACCAGCTGTGGACAGCAAGCTGGAGGATTC 
4861   +  +--  +   +  4920 

Hi sGl nGl  nAl  aAl  aGlyGlyProGl nProThrSerCysGlyGl nGl nAl aGlyGlyPhe 

GGC6GAAGGACGGGACCGACGGTGGAAGAGGTGGATTAAAGATAACAATTGAAGATGCAT 
4^21   +  +  +  +  +  ^  4980 

GlyGlyArgThrGlyProThrVal Gl uGl uVal AspEnd 
TTCCATGGCTTAACCAGAAACAACTGTCGATAGTGAA
4981   +  +  .-+   5017 
Sau  3A  Cloning  Site

GAICGTGATATTCTTCTCCTTTCCGGTGCTCTTCTCCTTAGCTGCCACGTTCAGGATTCC 
1   +  +  ^  ^  ^  ^  gQ 

IleThrlleAsnLysGluLysGlyThrSerLysGluLysAl aAlaValAsnLeuIleGly 
Bst  EII

GTTGGCATCCAGATCGAAGGTXACCTCGATCTGTGGCACACCACGTGGAGCCGGGGGAAT 

^1       "-" +  +  +  +  +  +  120 

AsnAl aAspLeuAspPheThrVal Gl ull eGl nProVal GlyArgProAl aProProIl e 

GCCCGAGAGGTCAAACTGTCCCAGAAGATTGTTGTCCTTGGTCATGGCTCGTTCTCCCTC 
121  +  +  +  ^  ^  ^ 

G lySerLeuAspPheGl nGlyLeuLeuAsnAsnAspLysThrMetAl aArgGl uGlyGl u 

Bgl  II 

6AACACCTGGATCGAAACGCCGGGCTGGTTGTCGGCGTATGTCGAGAAGATCTGCGTCTG 
181   +  +  ^  ^  ^  ^  240 

PheVal  Gl  nil  eSerVal  GlyProGl  nAsnAspAl aTyrThrSerPhell eGl nThrGl n 

TTTGCACGGAATGCGCGAGTTGCGCTCAATCAGCTTCGTCATCACACCTCCGGCCGTCTC 
241  +  +  +  +  ^_  

LysCysProIleArgSerAsnArgGluIleLeuLysThrMetValGlyGlyAlaThrGlu 


AATTCCAAGCGACAATGGAGCGACATCCACTAGCAGTACGTCTTGAATCTTATCGTCCTT 

 +  ^.  ^  _^  ^ 

II eGlyLeuSerLeuProAl aVal AspVal LeuLeuVal AspGl nil eLysAspAspLys 


GTCTCCGCTGAGGATGGCCGCCTGTACCGCTGCACCGTAAGCCACGGCCTCATCCGGATT 
351   +  +  +  ^  ^  ^ 

AspGlySerLeuIl eAl aAl aGl nVal Al aAl aGlyTyrAl aVal Al aGl uAspProAsn 

Pst  I 

421  ^^I^'^^^^J^^^'^^'^^^CTTTCCAGCGAAAAAGTTCIGCAGCAAGGACTGCACCTTCGG 
IleSerLeuAsnLeuSerLysGlyAlaPhePheAsnGlnLeuLeuSerGlnValLysPro 

GATGCGTGTGGAGCCTCCTACCAGGACGATATCGTGAATGGAGCTCTTATCCATCTTCGC 

'toi   1-  .j  1  _j   ^ 

IleArgThrSerGlyGlyValLeuVallleAspHisIleSerSerLysAspMetLysAla 


Figure 3-13. DNA Sequence of p70b. The sequence of the p70b left gene from the cloning site and the entire right gene are shown. Restriction sites shown on the map (Figure 3-4) are indicated, as are TATA boxes, transcription starts and predicted proteins. Heat-shock element-like sequences are shaded and underlined where they overlap.


Pst  I 

ATCGGACAGAGCCTTTTCCACTGGCTGCAGCGTCGAACGGAACAAGTCAGAACACAGCTC 
541   +   .+  +  ^  ^  500 

AspSerLeuAl aLysGl uVal ProGl nLeuThrSerArgPheLeuAspSerCysLeuGl u 

Cla  I 

CTCGAATCGTGCCCG6CTGATCTTCGTGTAATAATCGATGCCATCCATCAGGGCGTCAAT 

601   +  +  +  +--  +  +  660 

GluPheArgAl  aArgSerlleLysThrTyrTyrAspIleGlyAspMetLeuAl  aAspIle 

Xho  I 

CTCGATCGTTGCCTCCGTGCICGAGGACAGTGTGCGCTTCGCCCTCTCGCATGCCGTTCT 

661   +   +  +   +  +  -+  720 

Gl ull eThrAl aGl uThrSerSerSerLeuThrArgLysAl aArgGl uCysAl aThrArg 

Eco  RI 

CAAACGACGCAGAGCGCGAGCGTTCTTCGACAGATCCTTCTTGTGCTTTCGTTTGAATTC 
721   +  +  +  +  +  +  780 

LeuArgArgLeuAl aArgAl aAsnLysSerLeuAspLysLysHi  sLysArgLysPheGl u 


TTCCACGAAGTGGCCCACCATTCGGTTATCGAAGTCTTC6CCTCCCAAATGAGTATCTCC 
781   +  +..  .+   +  +  ^  840 

Gl  uVal  PheHi  sGlyValMetArgAsnAspPheAspGl uGlyGlyLeuHi  sThrAspGly 

GGCCGTGGATCGTACCTCAAACAGTGATCCCTCGTCGATCGTCAGAATGGACACGTCGAA 
841   +  +  +  +  +  +  900 

Al aThrSerArgVal Gl uPheLeuSerGlyGl uAspIl eThrLeuIl eSerVal AspPhe 

GGTGCCGCCTCCCAGATCGAAGATCAGAACATTGCGTTCTCCCTTTAGGTTCTTATCCAA 
901   +  +  +  +  +  +  950 

ThrGlyGlyGlyLeuAspPhelleLeuValAsnArgGluGlyLysLeuAsnLysAspLeu 

6CCATACGCCAGAGCTGCTGCCGTCGGTTCGTTGATGATGCGCATCACATTCAGTCCAGC 
951   +  +  ^  ^  ^  ^  JQ20 

GlyTyrAl  aLeuAl  aAl aAl aThrProGl uAsnll elleArgMetVal AsnLeuGlyAl  a 

GATGGCTCCAGCATCCTTTGTGGCCTGTCGCTGGCTGTCGTTGAAGTAGGCTGGTACTGT 
1021   +  +  +  +  +  ^  J080 

11 eAl aGlyAl aAspLysThrAl aGl nArgGl nSerAspAsnPheTyrAl aProVal Thr 


GATGACTGCATTTTTTACTGACTGGCCCAGGTAGGCTTCGGCGGTTTCCTTCATCTTCGT 
 +  +  ^  ^  ^   ^ 

11 eVal Al aAsnLysVal SerGl nGlyLeuTyrAl aGl uAl aThrGl uLysMetLysThr 


Figure  3-13--continued. 


CAGCACCATCGAACTGATTTCCTCCGGGGCAAAGGTTTTGCGCTCGCCCTTGAACTCGAC 
1141   +  +  +--  +  +..  +  1200 

LeuValMetSerSerll eGl uGl uProAl aPheThrLysArgGl uGlyLysPheGl uVal 

GCGGATCTTGGGCTTACCACCGTCATTTACCACCGTGAATGGCCAGTGCTTCATATCGGC 
1201   +  +  +  +  +  +  1260 

Argil eLysProLysGlyGlyAspAsnVal ValThrPheProTrpHi  sLysMetAspAl a 

TTGGATCTTCGGATCGTCGAATTTGCGTCCAATCAGTCGCTTGGCATCGAACACCGTGTT 
1261   +  +  +  +  +  ^  1320 

GlnlleLysProAspAspPheLysArgGlylleLeuArgLysAl aAspPheValThrAsn 

AGTCGGATTCATGGCCACTTGGTTCTTGGCTGCATCTCCGATGAGTCGCTCAGTGTCCGA 
1321   +   +  +   +  ^  ^  13gQ 

ThrProAsnMetAl aVal Gl nAsnLysAl aAl aAspGly II  eLeuArgGl  uThrAspSer 

GAACGCAACGTAGCTCGGTGTCGTTCGGTTGCCCTGGTCGTTTGCGATGATCTCCACCTT 
1381   +  +  +  +  +  ^  144Q 

PheAl  aValTyrSerProThrThrArgAsnGlyGl nAspAsnAl all  ell eGl uVal Lys 


TCCATGCTGGAACACACCAACGCAGGAGTACGTGGTGCCCAGATCGATTCCGATTGCCGA 
 +  +  ^  ^  ^  ^ 

GlyHi sGl nPheVal GlyVal CysSerTyrThrThrGlyLeuAspIl eGlyll eAl aSer 


Start  Codon 

AGGCAITCTGTGTCTCTGTGGTTCAACTTCGATGAATATGCTTTCTCAAATCACTCAAAC 

d""-;'""'  ■+  + +  1560 

ProMet 

TGGTGTGCACAATTATACGCTTTCTGATGCAACAATTGATTCACTCTGGTCACTGCTTGT 
1561   +  +  +  +  ^  ^  jg20 

^^^"^^^CACTTTATTTTTCACGTGTTTGCACTTGTTACTCTCAGCTCGCTCAGATT 

— +  + — -+  +-  +  +  1680 

Transcription  Starts  TATA  Box  Xba  I 

CAAATTGACGACAGCTGCTCGAACGGACCGGIIIAIAIACCACACCACTCGATTTCTACA 

1681   +  +  ^  ^  ^....jrrrr;  1740 


BIJeMCACinCCAgAGCTCTCCGCTAGGCTACTCGAACGCGATGAGGGAGA 
i/ni  +  ^  ^  ^  ^  ^ 


1800 


Figure  3-13--continued.
ATGCCGCGTTCTefiMATTTCTCeCGTACGAATCATCAAAGCGGACCCGGCTATTTTTAG
1801   +  +  +  +  +  ^  I860 

CCAATCGCGTGCGTGATGATGGAAAACGCMGAATGTGCGAGAGGAGAGAGAGTGAGGTG 
1861   +  +  +  +  +  +  1920 

GACAAAAAATGTGTTTGCTTTTGAAAGTGTTTATTCCTCTTAACTTTTAACAACATTAAA 

1921   +  --  +  +  .  +   iggQ 

AGAATGCTGGATTTAATTTAACAGAATACATTTTCAACAAAGCAGCTTGTAGGTCACAAT 
1981   +  +  +  +_  2040 

GCGTTTATTATTATGATAAAGTGCATATAGTTAAGGAAAGCTATTAGAAAGGAATATTAA 
2041   +  +  +   +  ^  2100 

oin,  '•""•"'"■'■ATTGCACCTCAAGTTTGCGTAGGCTAACAATTGTTAGAATTATTTAAATTTGATTT 
2101   +   ^  ^  2160 

TAATAATATTTTGTTCACAACTTGCCCTGAAAAATTGATTTGAATGATCGTAAAATTTAT 
2161   +  +  ^    ^  ^  2220 

AAAACTGTTATTGAATAATCCGTTACGAGTTATGCGGAATAAATTAATAAATCAACATTC 
2221  +  +  +  +  _^_   ^  2280 

00O1  ^^™GTCCCTCCTCGCTCGCTCTCCTCTC6CACATTCII£CGTTTTCCATCATCACGC 
^^^1    "■■ + -— + + ---+  +--  +  2340 

ACGCGATTGGCTTAAAAATAGCCGGGTCCGCTTTGATGATTCGTACGC6A<SAeATTTCa 

- -+  + -+  +  --+  +  2400 

lAATGCGGCATACAATCTCCCTCATCGCGTTCGAGTAGCCTAGCGGAGAGCTGTfiGAAAe 
^4ui  +  +  +  ^  +.....::::+  2460 

Xba  I  TATA  Box 

TegtGAACCTTOMAAATCGAGTGGTGTGGIAIAIAAACCGGTCCGTTCGAGCAGCTG 
2461   +  +  ^  ^  ^  ^  2520 

Transcription  Starts 

TCGTCAATTTGAATCTGAGCGAGCTGAGAGTAACAAGTGCAAACACGTGAAAAATAAAGT 
2521   +  +  ^  ^  ^  ^  2580 

Figure  3-13--continued. 
GTTTCAAAGTAACAAGCAGTGACCAGAGTGAATCAATTGTTGCATCAGAAAGCGTATAAT
2581   +  +  +  +  +  +  2640 

TGTGCACACCAGTTTGAGTGATTTGAGAAAGCATATTCATCGAAGTTGAACCACAGAGAC 
2641   +  +  +  +  +  +  2700 

Start  Codon 

ACAGAAIGCCTTCGGCAATCGGAATCGATCTGGGCACCACGTACTCCTGCGTTGGTGTGT 
2701   +  +  +  +  +  ^  2760 

MetProSerAl all eGly I 1 eAspLeuGl yThrThrTyrSerCysVal GlyVal Phe 


TCCAGCATGGAAAGGTGGAGATCATCGCAAACGACCAGGGCAACCGAACGACACCGAGCT 
 +  +  ^  ^  ^  ^  2j 

G I nHi  sGlyLysVal Gl ull ell eAl aAsnAspGl nGlyAsnArgThrThrProSerTyr 


ACGTTGCGTTCTCGGACACTGAGCGACTCATCGGAGATGCAGCCAAGAACCAAGTGGCCA 
2821   +  +  4.  ^  ^  ^  2880 

Val Al aPheSerAspThrGl  uArgLeu II  eGlyAspAl aAl aLysAsnGl nVal Al aMet 

0001  ^^^^"^^^^A^TAACACGGTGTTCGATGCCAAGCGACTGATTGGACGCAAATTCGACGATC 
2881   +  +  +  +  ^  ^  2940 

AsnProThrAsnThrVal PheAspAl aLysArgLeuIl eGlyArgLysPheAspAspPro 

CGAAGATCCAAGCCGATATGAAGCACTGGCCATTCACGGTGGTAAATGACGGTGGTAAGC 
2941   +  ^  ^  ^  ^  _  

LysIleGlnAl aAspMetLysHisTrpProPheThrValValAsnAspGlyGlyLysPro 


CCAAGATCCGCGTCGAGTTCAAGGGCGAGCGCAAAACCTTTGCCCCGGAGGAAATCAGTT 
+  ^  ^  ^  ^  ^ 

Lys  n  eArgVal Gl uPheLysGlyGl uArgLysThrPheAl aProGl uGl uI1 eSerSer 


CGATGGTGCTGACGAAGATGAAGGAAACCGCCGAAGCCTACCTGGGCCAGTCAGTAAAAA 
3061   +  +  +  ^  ^  ^  2120 

MetVal LeuThrLysMetLysGl uThrAl aGl uAl aTyrLeuGlyGl nSerVal LysAsn 

ATGCAGTCATCACAGTACCAGCCTACTTCAACGACAGCCAGCGACAGGCCACAAAGGATG 
dl^l   +  ^  ^  ^  ^  _  ^  ^^^^ 

Al aVal II eThrVal ProAl aTyrPheAsnAspSerGl nArgGl nAl aThrLysAspAl a 


Figure  3-13--continued.
CTGGAGCCATCGCTGGACTGAATGTGATGCGCATCATCAACGAACCGACGGCAGCAGCTC
3181   +  +  +..  _+   +  3240 

Gl  yAl  a 1 1  eAl  aGl  y LeuAsn Val  Met Arg 1 1 e 1 1 eAsnGl uProThrAl aAl aAl aLeu 

TGGCGTATGGCTTGGATAAGAACCTAAAGGGAGAACGCAATGTTCTGATCTTCGATCTGG 
3241   +  +  +  +  +  +  3300 

Al aTyrGlyLeuAspLysAsnLeuLysGlyGl uArgAsnVal Leull ePheAspLeuGly 

GAGGCGGCACCTTCGACGTGTCCATTCTGACGATCGACGAGGGATCACTGTTTGAGGTAC 
3301   +  +  +  +  ^  ^  33g0 

GlyGlyThrPheAspValSerlleLeuThrlleAspGluGlySerLeuPheGluValArg 

GATCCACGGCCGGAGATACTCATTTGGGAGGCGAAGACTTCGATAACC6AATGGTGGGNC 
3361   +  +  +  +  +  +  3420 

SerThrAl aGlyAspThrHi  sLeuGlyGlyGl uAspPheAspAsnArgMetVal GlyHi  s 
Eco  RI 

ACTTCGTGGAAGAAIICAAACGAAAGCACAAGAAGGATCTGTCGAAGAACGCTCGCGCTC 
3421   +  +  +  +  _^_  ^  3^gQ 

PheVal Gl uGl uPheLysArgLysHi  sLysLysAspLeuSerLysAsnAl  aArgAl aLeu 

Xho  I 

TGCGTCGTTTGAGAACGGCATGCGAGAGGGCGAAGCGCACACTGTCCTCGAGCACGGAGG 
3481   +  +  +  +  +  ^  3540 

ArgArgLeuArgThrAl aCysGl uArgAl aLysArgThrLeuSerSerSerThrGl uAl a 

Cla  I

CAACGATCGAAATTGACGCCCTGATGGATGGCATXGAITATTACACGAAGATCAGCCGGG 
3541   +  +  ^  ^  ^  ^  2600 

Thrll eGl ull eAspAl aLeuMetAspGlyll eAspTyrTyrThrLysIl eSerArgAl a 

Pst  I 

CACGATTCGAGGAGCTGTGTTCTGACTTGTTCCGTTCGACGCIGCAGCCAGTGGAAAAGG 
3501  +  +  +  ^  ^  ^  3ggQ 

ArgPheGl  uGl  uLeuCysSerAspLeuPheArgSerThrLeuGl nProVal Gl uLysAl a 

CTCTGTCCGATGCGAAGATGGATAAGAGCTCCATTCACGATATCGTCCTGGTAGGAGGGT 
3651    ---------+  +  +  _^.  ^  ^  3720 

LeuSerAspAlaLysMetAspLysSerSerlleHisAspIleValLeuValGlyGlySer 

Pst  I 

CCACACGCATCCCGAAGGTGCAGTCCTTGCIGCAGAACTTTTTCGCTGGAAAGTCTCTGA 
■^'21   +  +  +  +  ^  ^  3780 

ThrArgll eProLysVal Gl nSerLeuLeuGl nAsnPhePheAl aGlyLysSerLeuAsn 


Figure  3-13--continued. 
ACCTTTCGATCAATCCGGATGAGGCCGTGGCTTACGGTGCAGCGGTACAGGCGGCCATCC

3781   +---  +---  +  +  +  +  3840 

LeuSerll eAsnProAspGl uAl aVal Al aTyrGlyAl aAl aVal Gl nAl aAl all eLeu 


TCAGCGGAGACAAGGACGATAAGATTCAAGACGTACTGCTAGTGGATGTCGCTCCATTGT 
3841   +  +--  +  +  +  +  3900 

SerGlyAspLysAspAspLysIleGlnAspValLeuLeuValAspValAlaProLeuSer 


CGCTTGGAATTGAGACGGCCGGAGGTGTGATGACGAAGCTGATTGAGCGCAACTCGCGCA 
3901   +  +  +  +  +...  +  3960 

LeuGlyll  eGl  uThrAl  aGlyGlyValMetThrLysLeuIl eGl uArgAsnSerArgll e 
Bgl  II 

TTCC6TGCAAACAGACGCAGAICITCTCGACATACGCCGACAACCAGCCCGGCGTTTCGA 
3961   +  +---  +...  +  +  +  4020 

ProCysLysGl nThrGl n II ePheSerThrTyrAl aAspAsnGl nProGl yVal Ser II e 

TCCAGGTGTTCGAGGGAGAACGAGCCATGACCAAGGACAACAATCTTCTGGGACAGTTTG 

4021   +  +  +  +  +--  +  4080 

Gl nVal PheGl uGlyGl uArgAl aMetThrLysAspAsnAsnLeuLeuGlyGl nPheAsp 

Bst  EII 

ACCTCTCGGGCATTCCCCCGGCTCCACGTGGTGTGCCACAGATCGAGGTGACCTTCGATC 
4081   +  +  +  +-  __.+  +  4140 

LeuSerGlyll  eProProAl  aProArgGlyVal ProGl nil eGl uVal ThrPheAspLeu 

TGGATGCCAACGGAATCCTGAACGTGGCAGCTAAGGAGAAGAGCACCGGAAAGGAGAAGA 
4141   +  +   _^  ^   ^  4200 

AspAl aAsnGlylleLeuAsnVal  Al  aAl aLysGl uLysSerThrGlyLysGl uLysAsn 

ATATCACGATCAAGAACGACAAGGGTCGCCTATCGCAGGCCGATATCGATCGAATGGTGT 
4201   +  +  +  +  +  +  4260 

II eThrll eLysAsnAspLysGlyArgLeuSerGl nAl aAspIl eAspArgMetVal Ser 

CGGAAGCTGAGAAGTTCCGCGAGGAGGATGAGAAGCAACGCGAACGCATCTCTGCCCGCA 
4261   +  +  +  +  +-..  +  4320 

Gl uAl aGl ULysPheArgGl uGl uAspGl uLysGl nArgGl uArgll eSerAl aArgAsn 
Xho  I 

ATCAGCICGAGGCTTACTGCTTCAACCTGAAACAGTCGCTGGACGGCGAAGGAGCGAGTA 
4321   +  +  +  ^  ^  ^  43gQ 

Gl nLeuGl uAl aTyrCysPheAsnLeuLysGl nSerLeuAspGlyGl uGlyAl aSerLys 


Figure  3-13--continued. 


AACTCAGCGATGCCGATCGCAAGACAGTGCAGGATCGATGCGAAGAGACTCTGCGATGGA 
4381   +  +  +  +  +  +  4440 

LeuSerAspAl aAspArgLysThrVal Gl nAspArgCysGl uGl uThrLeuArgTrpIl e 

TCGACGGCAACACAATGGCCGATAAGGAGGAGTTCGAGCACAAGATGCAAGAGCTAACGA 
4441   +  +  +  +  +  +  4500 

AGCTGCCGTTGTGTTACCGGCTATTCCTCCTCAAGCTCGTGTTCTACGTTCTCGATTGCT 
AspGlyAsnThrMetAl aAspLysGl uGl uPheGl uHi  sLysMetGl nGl uLeuThrLys 

AGGCATGCAGCCCCATCATGACGAAACTGCACCAGCAGGCAGCTGGCGGGCCCTCGCCAA 
4501   +  +  +  ^  ^  ^  ^5gQ 

Al aCysSerProIl eMetThrLysLeuHi  sGl nGl nAl aAl aGlyGlyProSerProSer 

GCAGTTGCGCACAGCAAGCTGGAGGATTTGGAGGAAGGACGGGTCCGACAGTGGAAGAAG 
4561   +  +  +  +  ^  ^  4g2o 

SerCysAl aGl nGl nAl aGlyGlyPheGlyGlyArgThrGlyProThrVal Gl uGl uVal 

Putative  Polyadenylation  Signal 

TGGATTAAGGAGTAGAAAIMCGGAGATTTATAATTGATTCGAAGAGGATGGCATTGACT 

4621    ---------+  +  +  +  -+  +  4680 

AspEnd 

GAATATGATTACTCATATAGTATGTTCCTATG 
4681   +  +  .+..  4712 


Figure  3-13--continued. 
Translation  Start
ACACA-AATG 

87C  TGAATACTTTCAACAAG- - -TTACCGAGAAAGAAGAACTC  C . ~ 

87A   TCG  --  c.... 

p70a,  L        ATTTCTGAGAGAAAACATCG  AGAGCAACAA  G 

P70a.  R   T....AGACCAAGTT  G....* 

p70b ,  L        GTGATTTGAGAAAGCATATTCATCGAAGTTGAACCCAGAG  G 

P70b,  R    Q 


-40  -30  -20  -10 


Figure 3-15. Translation Start Alignment. Alignment of two D. melanogaster Hsp70 (87C Gene 1 and 87A Distal) and the A. albimanus [illegible] sequences upstream of the ATG start codon (underlined) is shown. A region of conserved sequence is [illegible] is relative to the [illegible] the line above.
A.

Consensus  (5  out  of  6)  AgTT-AAat-aAA-Aa-C-AAg-Ga-AACA 
Predominant  (>3  out  of  6)  AGTTCAAATCAAA-AATCAAAGTGAAAACA


H sp22  A A • bCC*  bAwC* •C* •CT* • • • 

Hsp23  ....G..T  A.GC  C..T.... 

Hsp26  . .A. .G. . .TC. .A. .T.G.GCA.TG  

Hsp27  . . .CT. . .CTG. .A. .TTG. . .GC. . .CGT 

Hsp68  .T..G  CGT  

Hsp70  .A  T  C..G  C...

Hsp83  AGTCTTGAAAAAAATTTCGTACGGTGTGCG 


B. 

CAAACAAGC-AA 

B8        AGCGACAAT   AACACGTCGCTAAGCGAAAGCTAAGCAAATA 

87C       ... -A. .A.TCAATTCAAACAAGCAAAGTG  

87A   A....G....--  G  C. 

p70a , L  ACTGTCAGTTGAACACAAACAAGCGAACAAGCAAGCAAGAGAGCCAAGAACGGCGCGAAGC 
p70a,R  ..C..C  A. 

p70b  L  GTCGTCAATTTGAATCTGAGCGAGCTGAGAGTAACAAGTGCAAACACGTGAAAAATAAAGT 
p70b  R  ..C..C  


10  20  30  40  50 


Figure 3-15. Transcription Start Alignments. A. The consensus and predominant nucleotides in the transcription start regions of six D. melanogaster heat-shock genes are shown. One position is a tie for "C" or "A" (hyphenated in the predominant sequence line). The sequences and alignment are from Hackett and Lis (1983). In contrast, I have shown the sequence of Hsp83, which is not preferentially translated under heat shock. B. Alignment of three D. melanogaster Hsp70 genes (B8 Clone, 87C Gene 1 and 87A Distal) and the A. albimanus Hsp70 p70a and p70b left (L) and right (R) promoters downstream of the TATA box. A region of sequence shared by the p70a, 87C and 87A genes is shown above the alignment. Note that clone B8 has a deletion of this region and p70b is not similar. The p70a sequence consisting of repeats of C-A/G-A-G/A is shown in bold type. Numbering is relative to the transcription start sites shared by the p70a and p70b genes, with the start being 1; negative numbers are upstream and positive numbers downstream.

Figure 3-18. HSE Distribution. Spatial and quantitative distribution
of scores indicating the match of sequences to two different weighted
criteria for heat-shock elements. Scores and positions for matches
above 1100 (for Amin et al. (1988)) are indicated by the heights and
positions of the bars. Only scores in the 200 bp upstream of the TATA box
are shown. R and L indicate the right and left genes of p70a and p70b.
The 87A and 87C sequences are the same as those in Figure 3-14. Scoring
matrices and the scoring program WEIGHTS are included in Figures 3-1 and
3-2.


CHAPTER  4 

COMPARISON  OF  THE  DROSOPHILA  MELANOGASTER 

AND 

ANOPHELES ALBIMANUS HSP70 GENE FAMILIES

Introduction

The Hsp70 genes of Anopheles albimanus and Drosophila melanogaster
are members of families comprising at least four genes in the mosquito
and five or more in D. melanogaster (Ish-Horowicz et al., 1979). In
addition, the D. melanogaster family includes Hsp70 cognate genes (Hsc1,
Hsc2, Hsc4) which have similar sequences but, unlike the Hsp70 genes,
contain introns and are not heat inducible (Ingolia and Craig,
1982)(Craig et al., 1983). Another member, the Drosophila Hsp68 gene, is
closely related to the D. melanogaster Hsp70 gene and is heat
inducible, but is expressed at much lower levels (Holmgren et al.,
1979).

In this study, the four mosquito genes have been isolated in two
clones, p70a and p70b, each of which contains a pair of Hsp70 genes in
divergent  orientation  (Chapter  3).  These  are  located  at  two  loci  on  the 
same  chromosome  arm,  about  20  centiMorgans  apart.  In  most  species  of 
Drosophila,  the  Hsp70  genes  also  occur  in  two  divergently  transcribed 
pairs  at  two  loci  (4  genes),  but  are  very  tightly  linked  (Leigh-Brown 
and  Ish-Horowicz,  1981). 

The similar structure and sequence of the mosquito and Drosophila
Hsp70 genes suggest homology. However, it is possible that the
divergent gene arrangements are due to convergent evolution, and that
the mosquito genes actually have a closer relationship to the D.
melanogaster cognates. In this chapter, this possibility will be tested
using the available DNA sequences.

A  second  area  considered  will  be  the  genetic  mechanisms  that 
account  for  the  concerted  evolution  of  the  Hsp70  genes.    Extremely  high 
similarity  of  the  members  of  the  D.  melanogaster  Hsp70  gene  family 
suggests  that  concerted  evolution  has  occurred.    The  proposed  mechanism, 
gene  conversion,  is  believed  to  occur  within,  and  less  frequently 
between, loci (Leigh-Brown and Ish-Horowicz, 1981). The mosquito Hsp70
DNA  sequences  reported  in  Chapter  3  are  similar  and  provide  another 
opportunity  to  determine  whether  concerted  evolution  is  occurring  among 
the  mosquito  sequences,  and  if  so,  whether  gene  conversion  is  an 
adequate  explanation. 

Materials  and  Methods 

DNA  Sequences.    DNA  sequences  of  the  mosquito  Hsp70  clones  p70a 
and p70b are from Chapter 3. D. melanogaster Hsp70 gene sequences from
locus 87A (distal) and 87C (distal gene 1) are from various sources
(Ingolia et al., 1980)(Torok and Karch, 1980)(Karch et al., 1981) as
assembled in GenBank version 60 files DROHSP7A2 and DROHSP7D1
respectively. The D. melanogaster Hsc1 cognate sequences and D.
simulans sequences are from Ingolia and Craig (1982). D. melanogaster
cognate genes Hsc2 and Hsc4 are from Craig et al. (1983).


Sequence manipulations were done on the Genetics Computer Group
(GCG) Software Package (Devereux, 1984), version 6.1, and the Multiple
Sequence Editor (Massachusetts Institute of Technology), both running on
a MicroVAX II computer. Most alignments were visual, but the carboxyl
ends of the Hsp70 genes were aligned using the GCG program BESTFIT.

Sequence  Comparisons.    Dot-plot  comparisons  were  done  using  the 
computer program D3HOM (Fristensky, 1984). For mosquito/mosquito
comparisons,  the  homology  range  was  10  bases  and  the  minimum  homology 
displayed  was  60%.    For  mosquito/D.  melanogaster  comparisons,  the 
homology  range  was  3  and  the  minimum  homology  displayed  was  80%.  In 
both  cases  the  scale  factor  was  0.95.    Additional  dot-plot  comparisons 
were  done  using  the  GCG  programs  COMPARE  and  DOTPLOT  (Maizel  and  Lenk, 
1981).    The  window  size  for  those  comparisons  was  21  and  the  stringency 
14. 
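
The window/stringency criterion used for the COMPARE plots can be
illustrated with a short Python sketch. It is a simplified stand-in
written for illustration, not the GCG or D3HOM code: the function name
dot_matrix and the toy sequences are assumptions, and a dot is recorded
at the window start rather than at any position the original programs
may have used.

```python
def dot_matrix(seq_a, seq_b, window=21, stringency=14):
    """Return (i, j) window start positions (0-based) at which at least
    `stringency` of the `window` aligned bases are identical.

    Illustrative only: a brute-force comparison of every window pair.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Count identical bases in the two aligned windows.
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= stringency:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from this study.
    a = "ACGT" * 6
    b = "ACGT" * 6
    print(dot_matrix(a, b, window=21, stringency=21))
```

Lowering the stringency relative to the window admits more diverged
regions at the cost of more background dots, which is why different
settings suit intraspecific and interspecific comparisons.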

Nucleotide  divergence  was  determined  using  the  GCG  program 
DISTANCES.    Amino  acid  divergence  was  calculated  using  DISTANCES 
considering  identity  only.     Parsimony  analysis  of  DNA  sequences  was 
performed  using  the  computer  program  Phylogenetic  Analysis  Using 
Parsimony  (PAUP)  version  2.4.1  (Swofford  and  Maddison,  1987). 
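
For readers unfamiliar with identity-only divergence, the calculation
can be sketched as follows. This is an uncorrected percent-difference
measure written for illustration, under the assumption that gapped
positions are skipped; it is not the DISTANCES program itself, and the
function names and toy sequences are hypothetical.

```python
def percent_divergence(seq_a, seq_b, gap="-"):
    """Uncorrected percent difference between two aligned sequences.

    Positions where either sequence has a gap are skipped (an assumption
    of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def distance_matrix(aligned):
    """Pairwise divergence for a dict mapping name -> aligned sequence."""
    names = list(aligned)
    return {
        (x, y): percent_divergence(aligned[x], aligned[y])
        for x in names
        for y in names
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not sequences from this study.
    toy = {"geneL": "ACGTACGTAC", "geneR": "ACGTACGTAA"}
    print(distance_matrix(toy)[("geneL", "geneR")])
```

The same function applies to aligned protein sequences, since only
identity is considered.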

Results  and  Discussion 

The  mosquito  clones  p70a  and  p70b  each  contain  two  genes  and  are 
derived from two loosely linked loci. The D. melanogaster Hsp70 genes
are located at two very tightly linked loci. Comparison of these
sequences in all permutations permits detection of regions of similarity
that might indicate their phylogenetic relationships and functionally
conserved regions. Protein-coding DNA sequences were easily aligned so
that  base-for-base  comparisons  could  be  made.    However,  for  other  more 
highly  diverged  sequences,  dot-plot  comparisons  were  used  since  this 
method  allows  the  detection  of  repeated,  rearranged,  or  gapped  sequences 
that  might  be  missed  using  a  base-for-base  comparison  method. 

Mosquito genes are similar within loci, but much less so between
loci. The protein-coding regions of the pairs of divergently
transcribed  genes  of  p70a  are  98.9%  similar  at  the  nucleic-acid  level, 
and  can  be  aligned  without  gaps  (Table  4-1).    This  similarity  extends  to 
the  untranslated  leaders  and  150  bp  upstream  of  the  TATA  box  as  well 
(Figure  4-1).    Likewise,  the  p70b  pair  are  99.9%  similar  in  the  protein 
coding  regions.    The  untranslated  leaders  are  identical,  and  the  250  bp 
upstream  of  the  TATA  box  are  very  similar. 

When  the  p70a  protein-coding  regions  are  compared  with  those  of 
p70b,  high  similarity  exists,  although  it  is  lower  than  that  observed  in 
the  intralocus  comparisons  (Table  4-1).    Sequences  upstream  of  the 
coding regions have very little similarity. Dot-plot comparisons of
these sequences show that the similarity is limited to a small region
around,  and  upstream  of  the  TATA  box. 

These  comparisons  demonstrate  that  the  pairs  of  genes  within  loci 
are  more  similar  to  one  another  than  to  those  of  the  other  locus.  The 
degree  of  similarity  of  the  p70b  pair  is  greater  than  the  p70a  pair  in 
all  regions  compared.    The  fact  that  the  limited  regions  of  similarity 
occur in the regulatory regions of p70a and p70b suggests that those
regions  that  are  shared  might  be  a  result  of  functional  constraints 
rather  than  a  common  recent  ancestor  or  information  exchange. 

In contrast, the D. melanogaster genes are similar not only
within, but between loci. The nucleic-acid distance matrix and dot-plot
comparisons show that the 87C distal gene 1 and 87A distal gene are
highly similar not only within the coding regions but for approximately
300 bases upstream of the TATA box. Similar comparisons have been
reported and the evolution of that gene family has been discussed
extensively (Artavanis-Tsakonas et al., 1979)(Moran et al., 1979)
(Ish-Horowicz et al., 1979)(Ish-Horowicz and Pinchin, 1980)
(Leigh-Brown and Ish-Horowicz, 1981)(Mason et al., 1982). A close
relationship of all of the members of the D. melanogaster Hsp70 gene
family has been observed.
This  is  in  contrast  to  the  interlocus  differences  of  the  A.  albimanus 
genes  which  appear  almost  totally  unrelated  outside  of  the  protein- 
coding  regions. 

The A. albimanus Hsp70 genes share features with those of D.
melanogaster, particularly the heat-shock-element arrays, long
untranslated leaders, divergent transcription arrangement, and
intronless structure. The nucleic-acid-sequence similarity is about
75%, and the predicted protein-sequence identity is 82%.

Nevertheless,  dot-plot  comparisons  of  regions  upstream  of  the 
protein-coding  regions  reveal  no  extensive  regions  of  sequence 
similarity (data not shown). Note also that, although A. albimanus and
D. melanogaster are both Diptera, the predicted Hsp70 protein sequences
are as dissimilar as those of the nematode C. elegans and chicken
(Table 4-2). Although Hsp70 genes are generally well conserved, surprisingly
high  divergence  has  occurred  between  A.  albimanus  and  D.  melanogaster. 

Parsimony analysis of the Hsp70-like gene family. Relatively high
nucleotide and protein sequence divergence suggests that, in spite of
apparent homology, the D. melanogaster Hsp70 genes may not be the
closest relatives of the mosquito genes, and that the mosquito genes
might instead be more closely related to the D. melanogaster cognate
genes. Proliferation and divergence of the dipteran Hsp70-like gene
family may have occurred prior or subsequent to the
Nematocera-Brachycera divergence. The
availability of limited sequence data for the protein coding regions of
the cognate genes of D. melanogaster and D. simulans permits this
possibility to be examined.

In  order  to  determine  the  relationship  of  the  mosquito  Hsp70  genes 
to  the  family  of  D.  melanogaster  and  D.  simulans  genes,  Wagner  trees 
representing  the  most  parsimonious  reconstruction  of  the  gene 
phylogenies  were  constructed  using  PAUP.    The  algorithm  attempts  to 
reconstruct  ancestral  gene  sequences,  and  their  relationships,  by 
minimizing  the  number  of  changes  that  would  be  required  to  account  for 
the extant genes, in this case, using nucleotide sequences. By this
means a gene phylogeny is inferred (Swofford and Maddison, 1987)(Fitch,
1977).
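
The idea of minimizing the number of changes can be made concrete with
a small Python sketch that counts the minimum number of substitutions a
given tree requires, using Fitch-style set intersection and union at
each internal node. This is only the scoring step for one fixed tree; it
does not search for the best tree as PAUP does, and the function names,
gene labels and sequences below are hypothetical.

```python
def fitch_site(tree, states):
    """Return (possible_states, changes) for one alignment column.

    `tree` is a leaf name (str) or a 2-tuple of subtrees; `states` maps
    each leaf name to its base at this column.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The two subtrees can share a state: no extra change needed.
        return common, left_cost + right_cost
    # No shared state: one substitution must be placed on this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Total minimum number of changes over all columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for pos in range(n_sites):
        column = {name: seq[pos] for name, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


if __name__ == "__main__":
    # Hypothetical toy data: two gene pairs, a1/a2 and b1/b2.
    toy = {"a1": "AAGT", "a2": "AAGT", "b1": "CCGT", "b2": "CCGA"}
    paired = (("a1", "a2"), ("b1", "b2"))
    mixed = (("a1", "b1"), ("a2", "b2"))
    print(tree_length(paired, toy), tree_length(mixed, toy))
```

In the toy data the tree that groups each pair together needs fewer
changes than the tree that splits the pairs, so parsimony prefers it.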

The  gene  phylogeny  is  based  on  the  nucleic  acid  sequence  of  306 
nucleotides  of  the  amino  terminus  of  the  protein-coding  regions  of  two 
D.  melanogaster  Hsp70  genes  and  three  cognates,  a  D.  simulans  cognate, 
and four A. albimanus Hsp70 genes (Figure 4-4). These are the first 306
bases  of  all  but  the  cognates.    The  first  three  codons  of  those  genes 
are  deleted  for  maximal  alignment  (Ingolia  and  Craig,  1982). 

The  tree  demonstrates  that  the  genes  most  closely  related  to  the 
A.  albimanus  Hsp70  genes  are  the  D.  melanogaster  Hsp70  genes. 
Therefore, the A. albimanus genes are probably most immediately derived
from a common Hsp70 ancestor and not from the cognates. Similarly, the
close relationship of the D. simulans cognate to Hsc1 of D.
melanogaster shows the probable homology of this pair.

This  phylogeny  also  confirms  that  concerted  evolution  is  occurring 
in  the  heat  shock  gene  family  of  the  mosquito  in  a  similar  fashion  to 
that  observed  in  D.  melanogaster.    This  is  evident  since  the  pairs  of  A. 
albimanus  Hsp70  genes  cluster  away  from  all  Drosophila  sequences. 
Furthermore,  the  branch  lengths  that  separate  the  members  of  pairs  of 
genes  are  small  relative  to  those  that  separate  pairs  from  one  another 
or  from  the  nearest  D.  melanogaster  cluster.    This  indicates  that 
concerted  evolution  is  occurring  at  two  levels:  species,  A.  albimanus 
vs.  D.  melanogaster;  and  locus,  p70a  vs.  p70b.    The  analysis  also  shows 
that  the  p70b  genes  are  more  closely  related  than  the  p70a  genes  to  the 
D.  melanogaster  Hsp70  genes.    This  is  not  indicated  by  simply  examining 
the  nucleotide  divergence,  but  is  clarified  by  the  parsimony  analysis. 

Has a restriction site been conserved for 200 MY? Among Drosophila
spp.,  Hsp70  genes  are  highly  conserved  so  that  restriction  map 
comparisons  are  possible  for  phylogenetic  analysis.  Particular 
significance  has  been  attached  to  sites  of  one  restriction  enzyme, 
Xba  I,  in  confirming  the  Hsp70  phylogeny  (Leigh-Brown  and  Ish-Horowicz, 
1981). A pair of Xba I sites is highly conserved in the intergenic
promoter regions of the D. melanogaster, D. mauritiana, and D. simulans
genes, but not in D. teissieri, D. yakuba, or D. erecta.

Examination  of  the  restriction  map  of  the  A.  albimanus  genes  of 
p70b  shows  that  the  central  Xba  I  sites  are  in  similar  locations. 
Recall  that  the  genes  of  this  clone  also  cluster  closer  to  the  D. 
melanogaster  genes  in  parsimony  analysis.    Are  these  restriction  sites 
conserved  due  simply  to  common  ancestry,  or  is  there  an  additional 
reason for their occurrence in similar locations? The latter is
certainly the case since these sites occur in heat-shock elements that
have functional significance in the regulation of the heat-shock genes.
Heat-shock elements (CTNGAANTTCNAG) frequently overlap by four bases.
This results in the sequence CTNGAANTT[CNAG]AANTTCNAG (overlap in
brackets). The four overlapping bases, and one flanking base on either
side, constitute five of the six bases of an Xba I site, TCTAGA. Xba I
sites do occur with high frequency in HSE. For example, of 26 HSE
(included  in  468  bases)  compiled  in  one  source  (Pelham,  1985),  there 
were  eight  Xba  I  sites  and  eleven  more  sites  that  only  differed  by  one 
base.    Therefore,  Xba  I  sites  should  be  considered  with  caution  in 
phylogenetic  analysis. 
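
The relationship between overlapping HSE and Xba I sites can be checked
with a short Python sketch that scans a sequence for exact TCTAGA sites
and for sites differing by one base. The function names are
hypothetical, and treating the ambiguity code N as a mismatch is an
assumption of this sketch.

```python
XBA_I = "TCTAGA"


def mismatches(site, window):
    """Number of positions at which `window` differs from `site`."""
    return sum(1 for s, w in zip(site, window) if s != w)


def scan(seq, site=XBA_I):
    """Return (exact, one_off): 0-based start positions of windows that
    match `site` exactly or differ from it by a single base.

    N is treated as a mismatch (an assumption of this sketch).
    """
    exact, one_off = [], []
    for i in range(len(seq) - len(site) + 1):
        d = mismatches(site, seq[i:i + len(site)])
        if d == 0:
            exact.append(i)
        elif d == 1:
            one_off.append(i)
    return exact, one_off


if __name__ == "__main__":
    # Two consensus HSE overlapping by four bases, as in the text.
    overlapped = "CTNGAANTTCNAGAANTTCNAG"
    print(scan(overlapped))  # -> ([], [8])
    # The same junction with the ambiguous base resolved to T.
    resolved = "CTNGAANTTCTAGAANTTCNAG"
    print(scan(resolved))  # -> ([8], [])
```

The junction of the two consensus elements reads TCNAGA, one base short
of a full Xba I site, so any HSE pair with T at that position creates
the site regardless of ancestry.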

Gene  conversion  in  A.  albimanus  Hsp70  genes.    Like  most  Drosophila 
spp.,  A.  albimanus  contains  two  pairs  of  Hsp70  genes  in  divergent 
orientation.    Concerted  evolution  of  the  genes  within  each  clone  is 
certain  since  the  left  and  right  genes  are  similar  to  one  another  not 
only  in  positions  under  potentially  strong  selection  such  as  first  and 
second  positions  of  codons,  but  also  in  the  third  positions,  in  the 
untranslated leaders, and in the promoter regions as well. There are
several  explanations  for  how  this  high  sequence  similarity  could  arise. 
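
The comparison by codon position referred to above can be sketched in
Python as follows. The sketch assumes two gap-free, in-frame coding
sequences of equal length; the function name and the toy sequences are
hypothetical and are not data from this study.

```python
def identity_by_codon_position(cds_a, cds_b):
    """Percent identity at codon positions 1, 2 and 3.

    Assumes both sequences are in frame, gap-free and of equal length.
    """
    if not cds_a or len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("need equal-length, in-frame coding sequences")
    same = [0, 0, 0]
    for i, (a, b) in enumerate(zip(cds_a, cds_b)):
        if a == b:
            same[i % 3] += 1  # i % 3 is the position within the codon
    codons = len(cds_a) // 3
    return tuple(100.0 * s / codons for s in same)


if __name__ == "__main__":
    # Toy example: the two sequences differ only at third positions.
    print(identity_by_codon_position("ATGGCCAAA", "ATGGCTAAG"))
```

High identity at third positions, where many changes are silent, is what
distinguishes homogenization from selection acting on the protein alone.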

First,  the  pairs  may  be  the  result  of  duplication  events  recent 
enough  that  little  divergence  has  occurred  between  the  pairs.  However, 
the presence of two pairs of Hsp70 genes in all Drosophila spp. examined
suggests that this gene number and general arrangement predates the
Nematocera-Brachycera divergence. If true, in the absence of some
post-divergence homogenizing mechanism, divergence within loci would be as
great  as  that  observed  between  loci  or  between  species. 

Second, the observed sequence homogeneity could occur by gene
conversion within loci utilizing the palindromic nature of these
sequences (Leigh-Brown and Ish-Horowicz, 1981). The gene conversion
model is at odds, though, with observations regarding p70a. A conversion
model predicts that once conversion initiates, it will proceed
unidirectionally until termination of the event regardless of the
function of the sequences between the initiation and termination sites.
Selection will then preserve conversion events that are selectively
advantageous. The divergently transcribed pair of Hsp70 genes at the
87A locus of D. melanogaster have sequence similarity extending about
200 bases downstream of the translation stop signal, indicating that
conversion events have terminated or initiated beyond the end of
translation. However, if gene conversion were occurring at the mosquito
p70a locus, why does sequence similarity disappear immediately after the
translation stop signal, although a few bases away the sequence is
identical (Figure 4-4)? This would require that (1) the conversion event
initiated or terminated precisely at the translation stop codon, (2)
there is extremely strong codon usage selection, or (3) there is
selection against conversion extending downstream beyond the translation
stop, none of which seems likely. Another alternative is that conversion
will not extend beyond well-paired regions of DNA and that once sequence
divergence becomes sufficient at the 3' ends, a barrier exists to
conversion extension.

Third,  occasional  snapback  structures  might  occur  within  DNA 
molecules  annealing  the  similar  sequences  of  both  genes  of  a  pair. 
Mismatches  could  then  be  corrected  by  a  general  DNA  mismatch  repair 
mechanism. This would provide an opportunity for intergenic sequence
exchange  and  allow  homogenization  of  the  sequences. 

Gene conversion is suspected for Drosophila spp. genes not only
within, but also, less frequently, between loci. The A. albimanus p70a
and  p70b  untranslated  leaders  and  promoter  sequences  have  retained  no 
clear  homology  and  so  are  unlike  the  equivalent  D.  melanogaster 
sequences  which  are  highly  similar  to  each  other  (Figure  4-1,  plots  C 
and  D).    Thus  there  is  no  evidence  that  frequent  interlocus  conversion 
is  occurring  between  the  p70a  and  p70b  loci.    This  absence  of  interlocus 
gene  conversion  in  A.  albimanus  may  be  a  result  of  the  greater  distance 
separating  the  genes.    The  A.  albimanus  genes  are  more  closely  related 
to  one  another  at  the  nucleotide  and  amino  acid  levels  than  to  the  D. 
melanogaster  genes,  but  this  may  be  due  to  parallel  evolution  of  the 
genes  at  the  two  A.  albimanus  loci  rather  than  interlocus  gene 
conversion. 

CHAPTER 5
CONCLUSIONS 


The  combined  data  of  Southern  and  in  situ  hybridizations  indicate 
that  the  clones  isolated  represent  the  entire  set  of  Hsp70  and  Hsp83 
genes  of  the  mosquito  A.  albimanus. 

The  A.  albimanus  Hsp70  genes  share  many  induction  and  organization 
characteristics  with  the  Drosophila  spp.  homologues  yet  have  important 
differences.    This  is  observed  particularly  in  the  temperature  at  which 
induction  is  maximal,  the  diversity  of  the  leader  and  promoter  regions, 
and  the  quality  and  abundance  of  heat  shock  elements.    A  productive 
pursuit  might  be  to  determine  whether  the  observed  diversity  results  in 
a  variety  of  temporal,  tissue-specific,  and  heat-related  transcription 
patterns.    Such  information  could  easily  be  obtained  by  taking  advantage 
of  the  dissimilar  leader  sequences. 

The  number  and  quality  of  HSE  in  the  Hsp70  promoters  suggest  that 
they  may  provide  superior  expression  of  hybrid  genes  in  mosquitos. 
However,  whether  the  Hsp70  promoters  will  prove  to  be  useful  for  genetic 
engineering  will  be  determined  only  by  testing  hybrid  genes  in  cultured 
cells  or  transformed  insects. 

These  analyses  indicate  that  the  A.  albimanus  and  D.  melanogaster 
Hsp70  genes  are  true  homologues.  The  available  evidence  indicates  that 
the  A.  albimanus  genes  are  undergoing  intra-  but  not  interlocus  gene 
conversion. The A. albimanus Hsp70 genes are more diverse than those of
D.  melanogaster  although  the  general  organization  is  similar  to  that  of 
most  Drosophila  spp.    The  data  collected  here  also  provide  an  additional 
opportunity  to  test  the  gene  conversion  model  of  concerted  evolution  and 
have revealed a novel manifestation of high specificity in the extent of
conversion.

I  have  demonstrated  that  thermotolerance  can  be  induced  by  heat 
shock.    However,  the  temperature  at  which  induction  of  the  mosquito 
Hsp70  genes  occurs  suggests  that  they  are  not  responsible  for  the 
induced  thermotolerance  observed  in  those  experiments.    This  is 
consistent  with  observations  that  the  small  heat  shock  proteins  are 
probably  responsible  for  increased  thermotolerance  in  D.  melanogaster 
(Berger  and  Woodward,  1983). 

The preliminary information regarding the Hsp83 gene(s) indicates
that this locus is more complex than that of D. melanogaster in that it
contains two Hsp83-similar sequences. It will be interesting to
determine  whether  the  mosquito  gene  contains  introns  and  whether  this 
affects  the  expression  of  this  gene  at  high  temperatures.  The 
transcription  data  collected  are  very  preliminary.    However,  the 
availability  of  the  clones  that  I  have  isolated  will  allow  further 
characterization  with  relatively  little  effort. 